THE ANNALS 
of 
MATHEMATICAL 
STATISTICS 


THE OFFICIAL JOURNAL OF THE INSTITUTE OF
MATHEMATICAL STATISTICS 


VOLUME XVI 







THE ANNALS 
OF MATHEMATICAL STATISTICS 


EDITED BY 
S. S. WILKS, Editor 
C. C. CRAIG W. FELLER J. NEYMAN 


ALLEN T. CRAIG THORNTON C. FRY WALTER A. SHEWHART 
W. EDWARDS DEMING HAROLD HOTELLING A. WALD 


WITH THE COOPERATION OF


William G. Cochran Paul S. Dwyer William G. Madow
J. H. Curtiss Churchill Eisenhart Alexander M. Mood
J. F. Daly Paul R. Halmos Henry Scheffé
Harold F. Dodge Paul G. Hoel Jacob Wolfowitz


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the Institute
of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.

Changes in mailing address which are to become effective for a given issue 
should be reported to the Secretary on or before the 15th of the month preceding 
the month of that issue. The months of issue are March, June, September 
and December. Because of war-time difficulties of publication, issues may often 
be from two to four weeks late in appearing. Subscribers are therefore requested 
to wait at least 30 days after month of issue before making inquiries concerning 
non-delivery. 

Manuscripts for publication in the ANNALS OF MATHEMATICAL STATISTICS 
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts 
should be typewritten double-spaced with wide margins, and the original copy 
should be submitted. Footnotes should be reduced to a minimum and whenever 
possible replaced by a bibliography at the end of the paper; formulae in footnotes 
should be avoided. Figures, charts, and diagrams should be drawn on plain 
white paper or tracing cloth in black India ink twice the size they are to be 
printed. Authors are requested to keep in mind typographical difficulties of 
complicated mathematical formulae. 

Authors will ordinarily receive only galley proofs. Fifty reprints without 
covers will be furnished free. Additional reprints and covers furnished at cost. 

The subscription price for the ANNALS is $5.00 per year. Single copies $1.50. 
Back numbers are available at $5.00 per volume, or $1.50 per single issue. 


COMPOSED AND PRINTED AT THE 
WAVERLY PRESS, Inc. 
BALTIMORE, MD., U. S. A.





(FOUNDED BY H. C. CARVER)


Contents 


The Approximate Distributions of the Mean and Variance of a
Sample of Independent Variables. P. L. Hsu

Sampling Inspection Plans for Continuous Production which Insure
a Prescribed Limit on the Outgoing Quality. A. Wald
and J. Wolfowitz

The Expected Value and Variance of the Reciprocal and Other
Negative Powers of a Positive Bernoullian Variate. Frederick
F. Stephan

Random Walk in the Presence of Absorbing Barriers. M. Kac

On the Classification of Observation Data into Distinct Groups.
R. v. Mises

On an Extension of the Concept of Moment with Applications to
Measures of Variability, General Similarity, and Overlapping.
Milton da Silva Rodrigues

On a Problem of Estimation Occurring in Public Opinion Polls.
Henry B. Mann


Notes:
A Combinatorial Formula and its Application to the Theory of Probability
of Arbitrary Events. Kai-Lai Chung and Lietz C. Hsu
On the Mechanics of Classification. Carl F. Kossack
Note on an Identity in the Incomplete Beta Function. T. A. Bancroft


News and Notices

Annual Report of the President of the Institute

Annual Report of the Secretary-Treasurer of the Institute

Report of the Membership Committee of the Institute

Progress Report of the Committee on Post-War Development of the Institute







Entered as second-class matter at the Post Office at Baltimore, Maryland, under the Act of March 3, 1879 








THE APPROXIMATE DISTRIBUTIONS OF THE MEAN AND 
VARIANCE OF A SAMPLE OF INDEPENDENT VARIABLES 
By P. L. Hsu 
The National University of Peking 


1. Introduction. In this paper we shall study the mean and variance of a
large number, n (a sample of size n), of mutually independent random variables

(1)    $\xi_1, \xi_2, \cdots, \xi_n$,

having the same probability distribution represented by a (cumulative) distribution
function P(x). The rth moment, absolute moment, and semi-invariant of
P(x) are denoted by $\alpha_r$, $\beta_r$, and $\gamma_r$, respectively. It is assumed that for a certain
integer $k \ge 3$, $\beta_k < \infty$ and that $\alpha_2 > 0$. Hence there is no loss of generality in
assuming that

(2)    $\alpha_1 = 0, \qquad \alpha_2 = 1$.


The characteristic function corresponding to P(x) is denoted by p(t). 
We put 


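(3)    $\bar\xi = \frac{1}{n} \sum_{r=1}^{n} \xi_r, \qquad \eta = \frac{1}{n} \sum_{r=1}^{n} (\xi_r - \bar\xi)^2$,

(4)    $F(x) = \Pr\{\sqrt{n}\,\bar\xi \le x\}, \qquad G(x) = \Pr\Big\{\frac{\sqrt{n}\,(\eta - 1)}{\sqrt{\alpha_4 - 1}} \le x\Big\}$.

The definition of G(x) implies that $\alpha_4 < \infty$ and $\alpha_4 - 1 > 0$. The case $\alpha_4 - 1 = 0$
provides an easy degenerate case which will be treated separately (section 4).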
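As a concrete reading of (3) and (4), the following minimal Python sketch computes the sample mean, the sample variance, and the two standardized statistics whose distribution functions are F(x) and G(x). It assumes the normalization (2) (mean 0, variance 1), takes the divisor in the variance to be n, and treats the fourth moment as a known input. The function name `standardized_statistics` and the sample values are illustrative assumptions and do not come from the paper.

```python
import math

def standardized_statistics(sample, alpha4):
    """Return (sqrt(n)*mean, sqrt(n)*(eta - 1)/sqrt(alpha4 - 1)) as in (3)-(4).

    Assumes the parent distribution has mean 0 and variance 1, and that
    alpha4 (its fourth moment) is known and exceeds 1.
    """
    if alpha4 <= 1:
        raise ValueError("alpha4 - 1 must be positive for G(x) to be defined")
    n = len(sample)
    mean = sum(sample) / n                           # xi-bar in (3)
    eta = sum((x - mean) ** 2 for x in sample) / n   # eta in (3), divisor n
    stat_mean = math.sqrt(n) * mean
    stat_var = math.sqrt(n) * (eta - 1.0) / math.sqrt(alpha4 - 1.0)
    return stat_mean, stat_var

# Hypothetical sample; alpha4 = 3 is the fourth moment of a unit normal parent.
print(standardized_statistics([-1.0, 0.0, 1.0, 2.0], alpha4=3.0))
```

The guard on `alpha4` mirrors the remark that G(x) is only defined when the fourth moment exceeds 1.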
Cramér's theorem of asymptotic expansion¹ reads as follows:


THEOREM 1. If P(x) is non-singular and if $\beta_k < \infty$ for some integer $k \ge 3$,
then

(5)    $F(x) = \Phi(x) + \Psi(x) + R(x)$,

where

(6)    $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy$,

$\Psi(x)$ is a certain linear combination of successive derivatives $\Phi^{(3)}(x), \cdots, \Phi^{(3(k-3))}(x)$
with each coefficient of the form $n^{-\nu/2}$ times a quantity depending only on
$k, \alpha_3, \cdots, \alpha_{k-1}$ $(1 \le \nu \le k - 3)$, and

(7)    $|R(x)| \le Q / n^{(k-2)/2}$,

where Q is a constant depending only on k and P(x).

1 H. Cramér: Random Variables and Probability Distributions (1937), Ch. 7. This book
will be referred to as (C).







In particular, putting k = 3 we get that $|F(x) - \Phi(x)| \le Q n^{-1/2}$ provided
P(x) is non-singular and $\beta_3 < \infty$. If the condition of non-singularity of
P(x) be removed, then Liapounoff's theorem² furnishes the weaker result:
$|F(x) - \Phi(x)| \le A \beta_3 n^{-1/2} \log n$, where A is a numerical constant.

Very recently Berry³ succeeded in removing the factor log n from Liapounoff's
theorem under no other condition than that $\beta_3 < \infty$. We state here Berry's
theorem:

THEOREM 2. If $\beta_3 < \infty$, then

(8)    $|F(x) - \Phi(x)| \le \frac{A \beta_3}{\sqrt{n}}$,

where A is a numerical constant.
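Theorem 2 can be illustrated numerically. The sketch below computes the exact value of l.u.b. |F(x) - Φ(x)| for sums of standardized Bernoulli variables and prints it alongside its product with the square root of n, which should stay bounded if the error decays like n^{-1/2}. The Bernoulli parent, the helper names `sup_distance` and `beta3`, and the chosen values of n and p are assumptions made for illustration only; the experiment does not determine the numerical constant A.

```python
import math

def phi_cdf(x):
    """Standard normal distribution function, the function (6)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def beta3(p):
    """Third absolute moment of a standardized Bernoulli(p) variable."""
    q = 1.0 - p
    return (p * p + q * q) / math.sqrt(p * q)

def sup_distance(n, p):
    """Exact l.u.b. of |F(x) - Phi(x)| for the standardized sum of n Bernoulli(p)."""
    q = 1.0 - p
    sd = math.sqrt(n * p * q)
    cdf = 0.0
    worst = 0.0
    for s in range(n + 1):
        x = (s - n * p) / sd
        normal = phi_cdf(x)
        worst = max(worst, abs(cdf - normal))   # left limit at the jump
        cdf += math.comb(n, s) * p ** s * q ** (n - s)
        worst = max(worst, abs(cdf - normal))   # value at the jump
    return worst

for n in (10, 40, 160):
    d = sup_distance(n, 0.5)
    print(n, d, d * math.sqrt(n) / beta3(0.5))
```

Because F is a step function and Φ is increasing, the largest deviation must occur at a jump point or just before it, which is why only those two values are inspected per jump.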
An essential step in the proof of these results is the selection of a weighting 
function w(x) and the appraisal of the integral 


(9)    $\int_{-\infty}^{\infty} w(u)\{F(u + x) - \Phi(u + x) - \Psi(u + x)\}\, du$


($\Psi = 0$ when k = 3). In his book² Cramér proves Theorem 1 by taking

(10)    $w(u) = \frac{1}{\Gamma(\omega)} (-u)^{\omega - 1}$ when $u < 0$ and $w(u) = 0$ when $u \ge 0$ $(0 < \omega < 1)$,

and proves Liapounoff’s theorem by taking 


(11)    $w(u) = \frac{1}{\epsilon \sqrt{2\pi}}\, e^{-u^2 / 2\epsilon^2}$.


On the other hand, Berry uses the following weighting function in his proof of 
Theorem 2: 


(12)    $w(u) = \frac{1 - \cos Tu}{u^2}$.


The unfortunate selection of the function (11) accounts for the presence of the 
factor log n in Liapounoff’s theorem. 

Now Cramér's proof of Theorem 1, based on the integral (9) with w(u) defined
in (10), makes use of a result on that integral due to M. Riesz. A more elementary
proof than this can be devised. In fact, one has only to use, with
Berry, the function (12) and to adopt his elementary appraisal⁴ of the integral


2 (C), Ch. 7.
3 A. C. Berry: "The accuracy of the Gaussian approximation to the sum of independent
variates." Trans. Amer. Math. Soc., Vol. 49 (1941), pp. 122-136. This paper will be
referred to as (B).
4 Berry proves the inequality (in our notation):

$\Big| \int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2} \{F(x + a) - \Phi(x + a)\}\, dx \Big| \le \int_0^T \frac{(T - t)\, |f(t) - e^{-t^2/2}|}{t}\, dt$







(9) in order to obtain the proof of Theorem 1. One of our purposes is therefore 
to give an elementary proof of Theorem 1, without reference to the above- 
mentioned result due to M. Riesz. Section 2 is devoted to this work. 

We ought to add that Cramér’s theorem and Berry’s theorem correspond to 
Theorems 1 and 2 for the case in which the random variables (1) do not follow 
the same distribution. The proof given in Section 2 is adaptable to these more 
general theorems when subjected to appropriate modifications; the assumption
of a common distribution function for (1) is only made for the sake of convenience.

So much for the known results for the approximate distribution of $\bar\xi$. By a
purely formal operational method Cornish and Fisher⁵ obtain terms of successive
approximation to the distribution function of any random variable X with the
help of its semi-invariants. It is hardly necessary to emphasize the importance
of turning Cornish and Fisher's formal result (asymptotic expansion without
appraisal of the remainder) into a mathematical theorem of asymptotic expansion
which gives the order of magnitude of the remainder. In this paper we
achieve this for the simplest function of (1) next to $\bar\xi$, viz. the $\eta$ in (3). We do
not seek to remove the assumption of a common distribution for (1), as there
will be no practical significance (e.g. in statistics) of $\eta$ if the variables (1) do not
have the same probability distribution. Section 3 is devoted to the proof of
the following theorems: 

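THEOREM 3. If $\alpha_6 < \infty$ and $\alpha_4 - 1 - \alpha_3^2 \ne 0$ (it cannot be negative), then

(13)    $|G(x) - \Phi(x)| \le \frac{A}{\sqrt{n}} \cdot \frac{\alpha_6}{(\alpha_4 - 1 - \alpha_3^2)^{3/2}}$,

where A is a numerical constant.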
THEOREM 4. Let P(x) be non-singular and let $\alpha_{2k} < \infty$ for some integer $k \ge 3$.
Then

(14)    $G(x) = \Phi(x) + \chi(x) + R_1(x)$,


where ®(x) is the function (6), x(x) is a linear combination of the derivatives ©’ (x), 
- ,@°*™ (x) with each coefficient of the form n* times a quantity depending only 
on k and az, a4, °** , x2, and 


(B), p. 128. The ‘‘appraisal’’ mentioned here refers to (50) which is contained in B, p. 128. 


But Berry’s appraisal of the integral in the right-hand side of the above inequality is in 
default. He writes 


cle © 
ii - | 2 “i 
[ (2 i” t)tew dt = — = / (1.1 —c)?+c— = e~Pl2 dt 
9 6 2 cle t 
(B, p. 132, line 3) whilst the last integral ought to be 
oo 
/ {(1.1 — c)t® + c — Qct}e-*”!? dt. 
cle 


5 E. A. Cornish and R. A. Fisher: "Moments and cumulants in the specification of
distributions." (Revue de l'Institut International de Statistique (1937), pp. 1-14.)







(15) | Ri(x) | < as ifk = 4,5 o0r6 


, 
(16) | Ri(x) | < ae ifk>7 
where Q; and Q; are constants depending only on k and P(x). 

It may be noticed that Theorem 3 is a "Berryian" theorem about G(x), its
characteristic feature being the absence of any condition on the distribution
function except the two on its moments, and that Theorem 4 is a "Cramérian"
theorem about G(x), the characteristic feature being the assumption of non-singularity
of P(x) besides that $\alpha_{2k} < \infty$.

In proving these theorems we have devised a method which is applicable to
getting similar results about functions other than $\eta$, such as functions commonly
used in applied statistics: the higher moments about the means, the
moment ratios (e.g. K. Pearson's $b_1$ and $b_2$), the covariance, the coefficient of
correlation, and "Student's" t-statistic. Work on such functions is being
done by my university colleagues, and the results will be published shortly.

If $\xi$ is any of the random variables (1), then

$0 \le \mathcal{E}\{a(\xi^2 - 1) + b\xi\}^2 = a^2(\alpha_4 - 1) + 2ab\,\alpha_3 + b^2$

for all real (a, b). Hence $\alpha_4 - 1 - \alpha_3^2 \ge 0$, and $\alpha_4 - 1 - \alpha_3^2 = 0$ means that
there is unit probability that $\xi$ assumes exactly two values. This easy degenerate
case is first eliminated in Theorem 3 by the assumption $\alpha_4 - 1 - \alpha_3^2 \ne 0$
and then considered in section 4. In Theorem 4 the condition $\alpha_4 - 1 - \alpha_3^2 \ne 0$
is implied, since $\xi$ cannot be a random variable of the nature just described owing
to the non-singularity of P(x).
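The inequality between the third and fourth moments, and its case of equality, can be checked on small discrete distributions. The following sketch standardizes a finite distribution to mean 0 and variance 1 and evaluates the fourth moment minus 1 minus the square of the third moment. The function name `moment_gap` and the two example distributions are hypothetical choices made for illustration.

```python
import math

def moment_gap(values, probs):
    """Return alpha4 - 1 - alpha3**2 for the standardized discrete distribution."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    sd = math.sqrt(var)
    z = [(v - mean) / sd for v in values]           # mean 0, variance 1
    alpha3 = sum(p * x ** 3 for x, p in zip(z, probs))
    alpha4 = sum(p * x ** 4 for x, p in zip(z, probs))
    return alpha4 - 1.0 - alpha3 ** 2

# A two-valued variable (degenerate case) and a three-valued one.
two_point = moment_gap([0.0, 1.0], [0.7, 0.3])
three_point = moment_gap([-1.0, 0.0, 2.0], [0.3, 0.4, 0.3])
print(two_point, three_point)
```

The two-valued example should give a gap that vanishes up to rounding, while the three-valued example should give a strictly positive gap.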


2. Lemmas. Throughout this paper A, B, C, etc. will denote positive numerical
constants; $A_k$, $B_k$ ($A_{km}$, $B_{km}$), etc., will denote positive constants depending
only on some integer k (integers k and m), and $Q_k$ ($Q_{km}$) will denote a positive
constant depending only on k (k and m) and the distribution function P(x).
$\vartheta$, $\theta$, $\theta_k$ ($\theta_{km}$), $\Theta_k$ ($\Theta_{km}$) will denote respectively quantities such that $|\vartheta| \le 1$,
$|\theta| \le A$, $|\theta_k| \le A_k$ ($|\theta_{km}| \le A_{km}$), $|\Theta_k| \le Q_k$ ($|\Theta_{km}| \le Q_{km}$). These
symbols do not necessarily stand for the same quantity at each occurrence.
Thus $2\theta = \theta$, $k\theta_k = \theta_k$, etc. In particular any positive function of $k, \alpha_3, \cdots, \alpha_k$
is a $Q_k$.


1.1. Cramér obtains the asymptotic expansion of the characteristic function
of the distribution of $\sqrt{n}\,\bar\xi$, viz. $\mathcal{E}(e^{it\sqrt{n}\,\bar\xi})$, when (1) do not have the same
distribution, valid for $|t| \le Q n^{1/6}$. Since we assume a common distribution for (1),
so that the characteristic function is $\{p(t/\sqrt{n})\}^n$, we are able to derive an
asymptotic expansion valid for $|t| \le Q_k \sqrt{n}$. The extension to
$\{p(t_1/\sqrt{n}, \cdots, t_m/\sqrt{n})\}^n$ presents no difficulty. This is done in the following three lemmas,
of which Lemma 3 contains the final result.
LEMMA 1.

(17)    $\log p(t) = \sum_{r=2}^{k-1} \frac{\gamma_r}{r!} (it)^r + \theta_k \beta_k |t|^k$,  for $|t| \le \beta_k^{-1/k}$.


k—1 *,\r |k 
Proor: Since p(t) = 1+ 2 zo + " _ = 1+ q(t) say, we have, for 
Bi" |t| <1, 
k 


Ws SL 


Hence 
j t 
(18) log p(t) = ipsa (—i## ae t)\” + | a@ [2 


For 1 <j < [3(& — 1)] let us expand each (— aaa to get a polynomial 
q;(t) of degree k — 1 and a remainder 7;(¢). In doing this we regard q(t) formally 
as a polynomial of degree / in ¢t. For this polynomial we have the majorating 
relation 

a(t) << fi, 


whence 


—_ J ° . 1 5 
“ {q(t}? <« ea 


which gives 

(19) |r| < ae Sj Bel [heh < je’ Bl tl’ < AnBe| tf. 
Similarly, 

(20) | g(t) | FP! < AB, | t | *. 

From (18), (19), (20) we obtain 


(21) log p(t) = qj(t) + Ox6:| t\*. 
1<j<[4(k—-1)] 
Since the sum in (21) must equal the sum in (17), the Lemma is proved. 
LemMa 2. Let (1, £2,°°:,m) be a random point with «(¢:) = 0 and 
e(| £: | *) = Bui < © for some integerk >3(i =1,---,m). Let p(t, --+ , tm) 
be the characteristic function. Then for | t:| < m?*"*Byi"*\/n (i = 1, «++ , m) 
we have 


99 l th tin ee > a U, Ox Vi. 
(22) n log p Jn’ marVs = Anie® + pie) 


n r=2 







where U, and V, are the rth semi-invariant and the absolute moment respectively of 
> tt; 

Proor: If |t:| < m?**er)/*4/n, then ." < m*”" (xpi: | ti em 

' , = t ] 

(k-l/ksweallk \ m m 

m DBei | ti |) < . Since —-= } isthe value att = —= 
Che'le) Svan 2 oe 1) Va 

the characteristic function of Sti¢;, it follows from Lemma 1 that for ~/n > 
Vi.* we have (22). 

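LEMMA 3. Let $(\zeta_1, \cdots, \zeta_m)$ be a random point with $\mathcal{E}(\zeta_i) = 0$, $\mathcal{E}(\zeta_i^2) = 1$ and
$\mathcal{E}(|\zeta_i|^k) = \beta_{ki} < \infty$ for some integer $k \ge 3$. Let $\rho_{ij} = \mathcal{E}(\zeta_i \zeta_j)$ ($\rho_{ii} = 1$; $i, j = 1,
\cdots, m$) and let the matrix $\|\rho_{ij}\|$ be positive definite. Let

(23)    $\Delta = \det |\rho_{ij}|, \qquad \varphi(t_1, \cdots, t_m) = \exp\Big(-\tfrac{1}{2} \sum_{i,j=1}^{m} \rho_{ij} t_i t_j\Big)$.

Let $p(t_1, \cdots, t_m)$ be the characteristic function. Then there exists a $B_{km}$ such that
for $|t_i| \le B_{km} \Delta \sqrt{n} / \beta_{ki}^{3/k}$ $(i = 1, \cdots, m)$ we have

(24)    $\Big\{p\Big(\frac{t_1}{\sqrt n}, \cdots, \frac{t_m}{\sqrt n}\Big)\Big\}^n = \varphi(t_1, \cdots, t_m)\{1 + \psi(it_1, \cdots, it_m)\}$
        $+ \frac{\theta_{km}}{n^{(k-2)/2}} \sum_{i=1}^{m} \beta_{ki}^{3(k-2)/k} \big(|t_i|^k + \cdots + |t_i|^{3(k-2)}\big)\, e^{-(\Delta / 4m^{m-1}) \sum_{i=1}^{m} t_i^2}$,

where $\psi(it_1, \cdots, it_m)$ is a polynomial each of whose terms has the form

$n^{-\nu/2}\, a_{\nu_1 \cdots \nu_m} (it_1)^{\nu_1} \cdots (it_m)^{\nu_m}$,

with $1 \le \nu \le k - 3$, $3 \le \nu_1 + \cdots + \nu_m \le 3(k - 3)$, and $a_{\nu_1 \cdots \nu_m}$ depending only
on k and the moments $\mathcal{E}(\zeta_1^{\mu_1} \cdots \zeta_m^{\mu_m})$, $3 \le \mu_1 + \cdots + \mu_m \le k - 1$. If k = 3,
then $\psi = 0$.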

Proor. If |t:| < mt) ask A 4/n, then | ti | < on Bui” Vn since 
Q<land#ii>1. It ne. ices Lemma 2 and the fact Uz. = <pi;tit; that 


Sean, EN og | 
ip(Je ya) sai e(t, e+ fale 


k-2 ls 
= (i, -*+, bm 4 pam (oe. 


ij! (k— 2)! 


(25) 


where 


qa k= i” Uras eV. 


(26) s= wa 2. (+3) !n™ + nit 







Regarding s formally as a polynomial in n™@ let us expand each (j!)~'s’ (1 < 
j < k — 8) to get a polynomial s; of degree k — 3 in n™? and a remainder r; . 
For the formal eer s we have the majorating relation 


(27) 


r3/k 

Vets ce Abs ivr™ A; V ke yilky—4 

s< a Sy ye gt 
= = rin” “Vn & ri nrl2 Vn 


whence 


which gives 


Pa 0 v rrvlk 7 (k—2+2j) /k 
Ak J J k A, I k 


Ir;| < : PVE CARVE | vi) 
ee wail ua vine nik—2) ° 


. i as ° 
Since V;/* n=? < 1 as shown in the proof of Lemma 2, we have 


-ck-242iyie  Akm(2, Bre | ts|*)* 2" 
A; Vi - 


malk—2) ne (k—2) 





Arm(Qy Bus" | t |)" sila mm 3. gee 2t2!* | 4, a 2+2j 





i ni (k—2) Ss mi (k—-2) 


(k —2+2j)/k 3 (k—2) /k 
Since B,; > 1 we have B5**??"* < gi*®'*, Hence 


Ae x ars | t; er 


ni (k—2) 





(28) In| < 


Similarly 


el? ie z Bit 2) /k | t; jaa 
| 

(29) (k — 2)! 4 gy hk-2) 

From (25), (28), (29) we get 


tm 


\p coos Mm oft staf + Det ont ql 


a g(t, +++) bm) {1 + ¥(ih , -++, itm)} 


a Co) i) CE 





where (iti, ---, 2m) stands for =s;. The assertion about (ity, --- , itm) 
announced in the lemma can now be seen without difficulty. It remains to show 
that with suitable B,.,, in the lemma, we have ’ 


= 2 
tel —Ajimm-1 > t? 
o(ty 4 ty tm)e < e i=1 





i.e. 


(30) — 5 De, Pistet +|s|< Do ti. 


-7 5 = 


From (27) we have 


je] < Se vit < 2 (S awl hy 


Akm Am 
< Te (Xi be |u|) < 7a, Dy Base | te 


If we choose Bim < (4m”~Anm) (and Bim < m**"™ in order that the earlier 
results may not be affected), the Aim here coinciding with the last written A,,, 
in (31), we have, for | ti | < BrmBze "AV n, 


(31) 


(32) lel s —— 


On the other hand, if \1, \2, --+ , Am are the latent roots of || pi; || then each 
\; < m since their sum is m. Letting \, be the smallest one we have 


$3) srk > gu Daa oY Pde ded. 


(32) and (33) imply (30). Hence the lemma is proved. 
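Let us write down the particular cases m = 1 and m = 2 of (24):

(34)    $\Big\{p\Big(\frac{t}{\sqrt n}\Big)\Big\}^n = e^{-t^2/2}\{1 + \psi(it)\} + \frac{\theta_k \beta_k^{3(k-2)/k}}{n^{(k-2)/2}} \big(|t|^k + \cdots + |t|^{3(k-2)}\big)\, e^{-t^2/4}$,  $\Big(|t| \le \frac{B_k \sqrt n}{\beta_k^{3/k}}\Big)$,

(35)    $\Big\{p\Big(\frac{t_1}{\sqrt n}, \frac{t_2}{\sqrt n}\Big)\Big\}^n = e^{-\frac{1}{2}(t_1^2 + t_2^2 + 2\rho t_1 t_2)}\{1 + \psi(it_1, it_2)\}$
        $+ \frac{\theta_k}{n^{(k-2)/2}} \sum_{i=1}^{2} \beta_{ki}^{3(k-2)/k} \big(|t_i|^k + \cdots + |t_i|^{3(k-2)}\big)\, e^{-(1-\rho^2)(t_1^2 + t_2^2)/8}$,  $\Big(|t_i| \le \frac{B_k (1 - \rho^2) \sqrt n}{\beta_{ki}^{3/k}}\Big)$.

More specially let us rewrite (34) and (35) with k = 3:

(36)    $\Big\{p\Big(\frac{t}{\sqrt n}\Big)\Big\}^n = e^{-t^2/2} + \frac{\theta \beta_3 |t|^3}{\sqrt n}\, e^{-t^2/4}$,  $\Big(|t| \le \frac{B \sqrt n}{\beta_3}\Big)$,

(37)    $\Big\{p\Big(\frac{t_1}{\sqrt n}, \frac{t_2}{\sqrt n}\Big)\Big\}^n = e^{-\frac{1}{2}(t_1^2 + t_2^2 + 2\rho t_1 t_2)} + \frac{\theta}{\sqrt n} \big(\beta_{31} |t_1|^3 + \beta_{32} |t_2|^3\big)\, e^{-(1-\rho^2)(t_1^2 + t_2^2)/8}$,  $\Big(|t_i| \le \frac{B (1 - \rho^2) \sqrt n}{\beta_{3i}}\Big)$.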







In this paper only these last four formulae are needed; they are used in the 
proofs of Theorems 2, 1, 3, 4 respectively. Cases of m > 2 of (24) will be 
needed for the works on other functions alluded to in the introduction. 
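Formula (36) can be probed numerically for one simple parent distribution. The sketch below takes (36) to read: the nth power of p(t/√n) equals e^{-t²/2} plus a remainder of the form θ β₃ |t|³ n^{-1/2} e^{-t²/4}. That reading, the choice of a parent taking the values +1 and -1 with probability 1/2 each (so that p(t) = cos t and β₃ = 1), the range |t| ≤ √n (that is, B taken as 1), and the function names are all assumptions made for illustration.

```python
import math

def remainder_ratio(t, n, beta3=1.0):
    """Ratio of |p(t/sqrt n)^n - exp(-t^2/2)| to beta3*|t|^3*exp(-t^2/4)/sqrt(n).

    The parent is assumed to take the values +1 and -1 with probability 1/2,
    so that its characteristic function is p(t) = cos t.
    """
    exact = math.cos(t / math.sqrt(n)) ** n
    normal = math.exp(-t * t / 2.0)
    scale = beta3 * abs(t) ** 3 * math.exp(-t * t / 4.0) / math.sqrt(n)
    return abs(exact - normal) / scale

def max_ratio(n, steps=400):
    """Largest ratio on a grid over 0 < t <= sqrt(n) (B is taken as 1 here)."""
    top = math.sqrt(n)
    return max(remainder_ratio(top * j / steps, n) for j in range(1, steps + 1))

for n in (4, 16, 64, 256):
    print(n, max_ratio(n))
```

If the remainder has the stated form, the printed ratios play the role of |θ| and should remain bounded as n grows.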

1.2. In the following group of lemmas, which culminate in Lemma 7, one 
finds a generalization of the Riemann-Lebesgue theorem, viz. Lemma 6. 

LEMMA 4. Let f(x) be a polynomial of degree m > 0, with real coefficients:

(38a)    $f(x) = \sum_{\nu=0}^{m} a_\nu x^{m-\nu} \qquad (a_0 \ne 0)$.

Then

(38b)    $\Big| \int_0^1 e^{if(x)}\, dx \Big| \le \frac{A_m}{|a_0|^{1/m}}$.

1 
Proor: It is sufficient to prove the inequality for | cos f(x) dx. Divide 
0 


the interval into A,, sub-intervals in each of whose interior none of the deriva- 
tives f(x) (¢ = 1,---,m) vanishes. It is sufficient to consider one of these 
sub-intervals, say (a, b). Consequently each of the polynomials f(x) are 
monotonic in (a, b). Let 


(39) [= / ' cos f(x) dx. 


Suppose first that f’(2) is positive and increasing fora < « < b. Then 


b , 
tise [ ete as we = 


_ 1 | poi , | 
e+ pees | 9°@ 008 f(a) ae, (ate<bhi < dD), 


by the second mean-value theorem. Hence 


2 
fate 
Now 0 < f’(a + 3) = f’(a + €) — ef"(a + Oe)/2,4 < O< 1. Hence f’(a + 
€) > $ef’(a + Oe). Since f’’(x) is monotonic, we have either f’(a + «) > $f” 
(a+ e)orf’(a+ e) > def’ (a + 4e). In other words, there exists a constant C2 , 
independent of a or e, such that } < C. < land f’(a + e) > 3$ef(a + Cre). 

If f’’(x) > 0, we have, as before f’(a + Coe) > 4Cref’’"(a + Ce), where C3 
is independent of a or « and 4 < C3; < 1. If f’’"(x) < O, then, since 
0< f(a + 2C2.€) om f"(a + Cre) a Coef’”" (a + 6:C2€), 3 < A; < 1, we have 
f"(a + Cre) > —Coref’"(a + 260,Cre). As f’’(x) is monotonic, either f’(a + 
Coe) > —Coref’’(a + Cre) or f(a + Cre) > —Coef’’(a + 2C2€). In all cases 
we obtain f(a + Cre) > Bze | f’’(a + Ce) | , where Bs and C; are independent 
of a or ¢, andi < C3; < 2. Hence f’(a+ ©) > 3Bse'| f’"(a + Cse)|. Arguing 
with +f’’(a + C3e) as we did with f’(a + C2), and so on until we come to f*”’, 


(40) lT|se+ 







we obtain f’(a + 6) > Bne™ | f(a + Cre) | = Bue” | ao|. Substituting 
in (40) and putting « = | a|”" we obtain |J| < Anj|ao|” The proof 
presupposes that Cne < b — a. If the reverse inequality is true, then | J| < 
b—a<Cnj\a|}”. Hence the lemma is true for f’(x) positive and increas- 
ing in (a, b). 


b—a 
If f’(x) is positive and decreasing in (a, b), then J = | cos (—f(b — y)) dy, 
0 


—f(b — y) being a polynomial with the leading coefficient +-a) and the first 
derivative f’(b — y), which is positive and increasing. This case reduces there- 
fore to the preceding one. Finally, if f’(x) is negative, we have only to notice 


b 
that [ = | cos (—f(x)) dz. Hence the lemma is proved. 


Lemma 5. Let f(x) be the polynomial (38a), and let a, ¥ 0 for some r,0 <r <m. 
Then 


1 gl 
(41) | et dx| < os ‘ 
| Jo | = [a,|4™ 
Proor: We may assume that |a,| > 1, (41) being trivial if |a,| < 1. If 
r = 0 this reduces to Lemma 4. Suppose that the lemma is true for a), a, 
-+ Gra. Let fi(z) = age” +--+ + ape” ", fox) = f(x) — fil(xz) and 
divide (0, 1) into A, sub-intervals in each of which fi(x) is monotonic. It is 
sufficient to consider one of these sub-intervals, say, (a, b). We have 


I= f cos {fi(x) + fo(x)} dx 


b b 
= / cos fi(x) cos fo(x) dx — / sin fi(x) sin fo(x) dx. 
We have only to consider the integral of cosines, say J. Divide (a, b) into sub- 
intervals in each of whose interior cos fi(x) is monotonic and does not vanish. 
The number of such intervals does not exceed (37) "| fi(b) — fila)| < 
dr) *(| fi(b) | + | fila) |) < 2(jao| +--+ + |a1|). Then, by the second 
mean-value theorem, 


‘iii 
| J | < 2(| ao| fives fe a;_1| ) / cos fo(x) dx (a <b, <b). 


Hence, applying Lemma 4 to fo(x), we get 


| 


Am(|ao| + =++ + Laeal) & Amt ao. 


| a, os 


+ +++ + Jarl) 


| < Se 
(42) I | a | a, pen 





On the hypothesis of induction we have |J | < A, |) ai) °" (i = 0,--+,r— 1). 
If |a;| > |a, "2" for some i < r, then |Z| < Anja, [?™™; if |a:| < 
| a, |"?", then by (42), |Z| < Am|a,-|’””. The proof is therefore complete. 







LEMMA 6. Let f(x) be the polynomial (38a) and g(x) be summable over $(-\infty, \infty)$.
Then for every r we have

(43)    $\lim_{|a_r| \to \infty} \int_{-\infty}^{\infty} e^{if(x)} g(x)\, dx = 0$,  uniformly in $a_i$ $(i \ne r)$.


PROOF: By Lemma 5 we have

$\lim_{|a_r| \to \infty} \int_0^1 e^{if(x)}\, dx = 0$,  uniformly in $a_i$ $(i \ne r)$.

Hence

(44)    $\lim_{|a_r| \to \infty} \int_a^b e^{if(x)}\, dx = 0$,  uniformly in $a_i$ $(i \ne r)$,

for if $a \ne 0$ and $b \ne 0$, then (a, b) is the sum or the difference of two intervals of
the form (0, c) or (c, 0), and for the latter intervals the transformation $x = \pm cy$
reduces the interval of integration to (0, 1).

Let G be any open set of finite measure. Then G is the sum of a sequence
$\{I_\nu\}$ of non-overlapping intervals. Since $\sum mI_\nu = mG < \infty$, we have, for N sufficiently large,

$\sum_{\nu > N} mI_\nu < \epsilon$.

Hence

$\Big| \int_G e^{if(x)}\, dx \Big| \le \epsilon + \sum_{\nu=1}^{N} \Big| \int_{I_\nu} e^{if(x)}\, dx \Big|$,

which, together with (44), implies

(45)    $\lim_{|a_r| \to \infty} \int_G e^{if(x)}\, dx = 0$  uniformly in $a_i$ $(i \ne r)$.

Let S be any set of finite measure. Then there is an open set G such that $G \supset S$
and $m(G - S) < \epsilon$. Hence

$\Big| \int_S e^{if(x)}\, dx \Big| \le \epsilon + \Big| \int_G e^{if(x)}\, dx \Big|$.

Hence, by (45),

(46)    $\lim_{|a_r| \to \infty} \int_S e^{if(x)}\, dx = 0$  uniformly in $a_i$ $(i \ne r)$.

Now let h(x) be any positive "simple" summable function, i.e. $h(x) = a_\nu > 0$
for $x \in S_\nu$ $(\nu = 1, 2, \cdots, n)$ and h(x) = 0 otherwise. Since h(x) is summable,
each $S_\nu$ must be of finite measure. Hence

$\Big| \int_{-\infty}^{\infty} e^{if(x)} h(x)\, dx \Big| \le \sum_{\nu=1}^{n} a_\nu \Big| \int_{S_\nu} e^{if(x)}\, dx \Big|$,

which, together with (46), implies

$\lim_{|a_r| \to \infty} \int_{-\infty}^{\infty} e^{if(x)} h(x)\, dx = 0$  uniformly in $a_i$ $(i \ne r)$.

Finally, let g(x) be any summable function $\ge 0$. Then by a well-known theorem⁶
we have $g(x) = \lim h_n(x)$, where $\{h_n(x)\}$ is an ascending sequence of positive
summable simple functions. Hence

$\Big| \int_{-\infty}^{\infty} e^{if(x)} g(x)\, dx \Big| \le \Big| \int_{-\infty}^{\infty} e^{if(x)} h_n(x)\, dx \Big| + \int_{-\infty}^{\infty} (g(x) - h_n(x))\, dx$.

By monotonic convergence the last integral tends to 0 as $n \to \infty$. Hence

$\Big| \int_{-\infty}^{\infty} e^{if(x)} g(x)\, dx \Big| \le \epsilon + \Big| \int_{-\infty}^{\infty} e^{if(x)} h_n(x)\, dx \Big|$,

which implies (43). If g(x) is any summable function, we have only to consider
the customary expression of g(x) as the difference of two non-negative functions.
This completes the proof.


LEMMA 7. Let P(x) be a non-singular distribution function of a random variable
X, and let

(47)    $p(t_1, \cdots, t_m) = \int_{-\infty}^{\infty} e^{i \sum_{\nu=1}^{m} t_\nu x^\nu}\, dP(x)$.

Then for every r and every positive constant c we have

(48)    $\text{l.u.b.}_{|t_r| \ge c}\ |p(t_1, \cdots, t_m)| < 1$.

PROOF: We have $P(x) = a_1 P_1(x) + a_2 P_2(x)$, where $P_1(x)$ is absolutely continuous,
$P_2$ is singular, $a_1 > 0$, $a_1 + a_2 = 1$. Hence

$|p(t_1, \cdots, t_m)| \le a_1 \Big| \int_{-\infty}^{\infty} e^{i \sum t_\nu x^\nu} P_1'(x)\, dx \Big| + a_2$.

By Lemma 6 we may find C > 0 such that

$|p(t_1, t_2, \cdots, t_m)| \le \tfrac{1}{2} a_1 + a_2 < 1$,  if any $|t_i| \ge C$.

Suppose that

$\text{l.u.b.}_{|t_r| \ge c}\ |p(t_1, \cdots, t_m)| = 1$;

then c < C and we must have

(49)    $\text{l.u.b.}_{c \le |t_r| \le C,\ |t_i| \le C\ (i \ne r)}\ |p(t_1, \cdots, t_m)| = 1$.

Since $p(t_1, \cdots, t_m)$ is a continuous function, it must attain its least upper bound
in any bounded closed set. It follows that there is a point $(t_1^0, \cdots, t_m^0)$ such
that $t_r^0 \ne 0$ $(|t_r^0| \ge c)$ and $|p(t_1^0, \cdots, t_m^0)| = 1$. But this implies⁷ that the
distribution of $\sum t_\nu^0 X^\nu$ is discrete, i.e. that the distribution of X itself is discrete,
which contradicts the non-singularity of P(x). Hence (49) is false and (48) is
true.

6 H. Kestelman: Modern Theories of Integration (1937), p. 108.
7 Cf. (C), p. 26.


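1.3. In his cited work Berry⁸ shows that if F(x) is any distribution function
and if $\Phi(x)$ is the function (6), then there is a constant a such that

(50)    $\Big| \int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2} \{F(x + a) - \Phi(x + a)\}\, dx \Big| \ge \sqrt{\frac{2}{\pi}}\, T\delta \Big\{3 \int_0^{T\delta} \frac{1 - \cos x}{x^2}\, dx - \pi\Big\}$,

where $\delta = \sqrt{\pi/2}\ \text{l.u.b.}\, |F(x) - \Phi(x)|$. This is easily extended to the following
lemma, which needs no further proof.

LEMMA 8. Let F(x) be a distribution function and $F_1(x)$ be a function having
the following properties: (i) $F_1(x)$ is bounded for all x, (ii) $F_1(x) \to 1$ as $x \to \infty$,
$F_1(x) \to 0$ as $x \to -\infty$, (iii) $F_1(x)$ has a bounded derivative, $|F_1'(x)| \le M$. Let

$\delta = \frac{1}{2M}\ \text{l.u.b.}\, |F(x) - F_1(x)|$.

Then there exists a constant a such that

(51)    $\Big| \int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2} \{F(x + a) - F_1(x + a)\}\, dx \Big| \ge 2MT\delta \Big\{3 \int_0^{T\delta} \frac{1 - \cos x}{x^2}\, dx - \pi\Big\}$.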
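The lower bound in Berry's inequality and in Lemma 8 is useful only when the bracketed factor, three times the integral of (1 - cos x)/x² from 0 to Tδ minus π, is positive. Since the integral of (1 - cos x)/x² over (0, ∞) equals π/2, that factor tends to π/2 as Tδ grows. The sketch below tabulates it by the midpoint rule; the function name `berry_bracket`, the grid size, and the sample arguments are assumptions made for illustration.

```python
import math

def berry_bracket(s, steps=100_000):
    """Return 3 * integral_0^s (1 - cos x)/x^2 dx - pi, by the midpoint rule."""
    h = s / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h          # midpoints avoid the removable point x = 0
        total += (1.0 - math.cos(x)) / (x * x)
    return 3.0 * total * h - math.pi

for s in (1.0, 2.0, 5.0, 50.0):
    print(s, berry_bracket(s))
```

For small arguments the factor is negative, so the inequality carries information only once Tδ is large enough; this is why T is later chosen proportional to a power of n.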


1.4. In section 3 we define, for given e, k, \ and z, a function 
(52) G(a, y) = ew if zx <z+ny’, Gz, y) = 0 otherwise. 


The introduction of G(x, y) and the appraisal of its Fourier transform constitute 

the essence of our method of solving the problem of the asymptotic expansion 

of the distribution function G(x). The solution of the same problem about 

other functions of (1) alluded to in section 3 is based on the introduction of 

functions playing the role of G(x, y). We now prove the following lemma: 
LemMa 9. Let G(x, y) be defined by (52) and let 


(53) g(ti, te) = [ [ e =u Ge y) da dy. 
Then 


(51) 


: AA 
(i) g(t, b)| < ik 


2 3 |2 
(ii) |g (ts, te) a (ee al) if k = 3, 
2 


ei/3 e2/3 


@3/2k 


= Ak 214 
(iii) l g(t, te) | < a (>. +4 r lt ). 


8 (B), p. 128. 





PROOF: 
@) ata) |< [ G@,y) de dy =r [ye 
Ro — 30 


(ii) Putting k = 3 we have 


—it\z 0 


eu’ tan] oe or, 


g(t ’ t») re € dy, 


ity —eo 


1 vs , 
lat, 0)| < tg] [wea ay), 
| ty | | to |? | Jee | 
where u(y) = crag - corr. v(y) = e “*”. On integrating by parts we 
obtain 


| os mw | 1 . mr 
(54) | g(t, &) ge Lou" oay) < | lwo lav. 


Elementary calculation establishes that 


eo < "(216d | y |" + 756d | y |" 
1 


+ 336re|y|° + 8d°| 1? | y |? + 12r7| || y)). 


Substituting in (54) and making the transformation y = € “x we get the result. 
(iii) We have 


is . : | 
riGe te) | < rh | e uta] _— e thu") dy | , 
1 | 2 


Integrating by parts twice we obtain 


| g(t , te) | < IEF [ a fe*(y 7 ettihy?) y dy. 


By elementary calculations we get 


lg(ti, t)| < rai / (4k?vey"* + Qh(k + 3)rey™ + 4d°| try? + 2r)e”™* dy 
2\° wo 
which, on the transformation y = . zx, gives the result. 
1.5. We prove a few additional lemmas used in the proof of Theorems 3 and 4.
LEMMA⁹ 10. Let $u(x_1, \cdots, x_m) \ge 0$ be summable in the m-dimensional space
and let

(55)    $v(t_1, \cdots, t_m) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{i(t_1 x_1 + \cdots + t_m x_m)} u(x_1, \cdots, x_m)\, dx_1 \cdots dx_m$.

If $v(t_1, \cdots, t_m)$ is summable in the m-dimensional space, then

(56)    $u(x_1, \cdots, x_m) = \frac{1}{(2\pi)^m} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i(t_1 x_1 + \cdots + t_m x_m)} v(t_1, \cdots, t_m)\, dt_1 \cdots dt_m$.

9 Although the author believes that this lemma is almost classical, a proof is given owing
to lack of reference.


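PROOF: Except for a constant factor the function $u(x_1, \cdots, x_m)$ may be
regarded as a probability density function. Hence by the well-known inversion
formula of (55),

(57)    $\int \cdots \int_{a_j \le x_j \le b_j\ (j = 1, \cdots, m)} u(x_1, \cdots, x_m)\, dx_1 \cdots dx_m$
        $= \frac{1}{(2\pi)^m} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \Big( \prod_{j=1}^{m} \frac{e^{-it_j a_j} - e^{-it_j b_j}}{it_j} \Big) v(t_1, \cdots, t_m)\, dt_1 \cdots dt_m$.

Now $u(x_1, \cdots, x_m)$ is almost everywhere the symmetric derivative of the interval
function in the left-hand side of (57):

$u(x_1, \cdots, x_m) = \lim_{\epsilon \to 0} \frac{1}{(2\epsilon)^m} \int \cdots \int_{x_j - \epsilon \le y_j \le x_j + \epsilon\ (j = 1, 2, \cdots, m)} u(y_1, \cdots, y_m)\, dy_1 \cdots dy_m$.

Hence

(58)    $u(x_1, \cdots, x_m) = \lim_{\epsilon \to 0} \frac{1}{(2\pi)^m} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \Big( \prod_{j=1}^{m} \frac{\sin \epsilon t_j}{\epsilon t_j} \Big) e^{-i(t_1 x_1 + \cdots + t_m x_m)} v(t_1, \cdots, t_m)\, dt_1 \cdots dt_m$.

Owing to dominated convergence the order of the limit sign and the integration
sign in (58) may be inverted. Hence (56) is true.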
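LEMMA 11. We have

(59)    $\int_{-\infty}^{\infty} e^{itx}\, \frac{1 - \cos Tx}{x^2}\, dx = \begin{cases} \pi (T - |t|) & \text{if } |t| \le T, \\ 0 & \text{if } |t| > T. \end{cases}$

PROOF: The Fourier transform of the function in the right-hand side of (59) is

$\pi \int_{-T}^{T} e^{itu} (T - |t|)\, dt = \frac{2\pi}{u^2} (1 - \cos Tu)$.

Hence (59) follows from (56).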
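The transform used in the proof of Lemma 11 is an integral over a finite range and is therefore easy to verify numerically. The sketch below compares a midpoint-rule evaluation of π times the integral of e^{itu}(T - |t|) over (-T, T) with the closed form 2π(1 - cos Tu)/u². The function names, the value T = 2, and the step count are assumptions made for illustration.

```python
import cmath
import math

def triangle_transform(u, T, steps=20_000):
    """Midpoint-rule value of pi * integral_{-T}^{T} exp(i*t*u) * (T - |t|) dt."""
    h = 2.0 * T / steps
    total = 0j
    for j in range(steps):
        t = -T + (j + 0.5) * h     # midpoint of the j-th subinterval
        total += cmath.exp(1j * t * u) * (T - abs(t))
    return math.pi * total * h

def closed_form(u, T):
    """The right-hand side 2*pi*(1 - cos(T*u))/u**2 from the proof of Lemma 11."""
    return 2.0 * math.pi * (1.0 - math.cos(T * u)) / (u * u)

for u in (0.5, 1.0, 3.0):
    print(u, triangle_transform(u, 2.0).real, closed_form(u, 2.0))
```

An even step count places the kink of T - |t| at a cell boundary, which keeps the midpoint rule accurate.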
Lemma 12. 

(60)  $|\mathcal{E}(\xi_1 + \cdots + \xi_n)^k| \le A_k n^{k/2} \beta_k.$


Proor. As (60) is true for k = 1, let us assume, for induction, that it is true 
for 1, 2,---,k. Then, by symmetry, 


k ; 
ef: + +++ + En)P = nell + + +h) J =n Ee) — 







where U = & +.---+ &. Since e(£) 0, we have 


k 
k r rk—-r 
e(&: + --- + 8 ia = © » (‘) e(E1'U* ). 
On the hypotheses of induction we have | «U*”)| < Ax(n — 1)°* "8, < 
Ayn **,_, . Hence 
ler +--+ + &)*" 1 < k1Apn*** 58,41 Ber < Ann? Bais 


Therefore the induction is complete. 


3. Elementary Proof of Theorem 1. 2.1 We have defined 


(61) Fu) = Priva <2), 06) < Te [ay 


with the characteristic functions 


” - @ 9) eo) =. 


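Following Berry¹⁰ we use the equation 

(63)  $\int_{-\infty}^{\infty} \{F(x) - \Phi(x)\}\, e^{itx}\, dx = \frac{f(t) - \varphi(t)}{-it}.$

Let $\psi(it)$ be the polynomial in (34), and let us define $\Psi(x)$ as the function 
obtained from $\psi(it)$ through the replacement of each power $(it)^\nu$ by 
$(-1)^\nu \Phi^{(\nu)}(x)$. Integration by parts shows 
$(-1)^\nu \int_{-\infty}^{\infty} e^{itx}\, \Phi^{(\nu)}(x)\, dx = \frac{(it)^\nu \varphi(t)}{-it}$, whence 

(64)  $\int_{-\infty}^{\infty} \Psi(x)\, e^{itx}\, dx = \frac{\psi(it)\varphi(t)}{-it}.$

From (63) and (64) we obtain 

(65)  $\int_{-\infty}^{\infty} \{F(x) - \Phi(x) - \Psi(x)\}\, e^{itx}\, dx = \frac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it}.$

The function $\Psi(x)$ defined here is precisely the $\Psi(x)$ appearing in (5) under 
Theorem 1. Our task is to prove that 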
(66) | F(a) — (2) — ¥(a)| < 


nk-2) 12° 


Following Berry¹¹ we replace x by x + a in (65), getting 

(67)  $\int_{-\infty}^{\infty} \{F(x + a) - \Phi(x + a) - \Psi(x + a)\}\, e^{itx}\, dx = e^{-ita}\, \frac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it},$


10 (B), p. 127, Equation (23). 
11 (B), p. 127. 





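multiply both sides of (67) by $T - |t|$ and integrate with respect to t in $(-T, T)$: 

$2 \int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2} \{F(x + a) - \Phi(x + a) - \Psi(x + a)\}\, dx = \int_{-T}^{T} (T - |t|)\, e^{-ita}\, \frac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it}\, dt;$

the reversion of order of integration involved is obviously justifiable. Hence 

(68)  $\left| \int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2} \{F(x + a) - \Phi(x + a) - \Psi(x + a)\}\, dx \right| \le T \int_{0}^{T} \frac{|f(t) - \varphi(t)\{1 + \psi(it)\}|}{t}\, dt.$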


2.2. When in particular k = 3, (68) becomes 


-_ iL oe ([F(@ +a) — &@+a)}dz| < rf Me. 


If we choose a to be the a in (50), the left-hand side of (69) is not less than 


,/f? - {3 r? = 008 F ay — \, i. 4/5 bub.| FO) — &(z)|. 


On the other hand, taking 7 = Ave as in (36) the right-hand side of (69) is 


not greater than 
Al te a= A. 
0 
Hence 
™1—cosz 
. 2 


Now the left-hand side of (70), as a function of $T\delta$, is positive and increasing for 
sufficiently large $T\delta$, and becomes infinite as $T\delta \to \infty$. Hence (70) implies that 
$T\delta \le A$, i.e., 


lub.|F@) — &@)| <5 = om. 


giving Theorem 2. 


2.3. Coming back to the general case, we see that the function ®(x) + (zx) 
has a bounded derivative: | ®’(z) + W’(x) | < Q;, and also has all the properties 
of the function F;(z) in Lemma 8. On choosing a in (69) to be the a in (51) 
we obtain 










(71) — “\< pf Rt + 0 


where 
= Q lub. | F(x) — &(x) — W(z)]. 
Let us take 7 = (A, Bp "-+~/n)*” with A; in accordance with (34). Then 
"\#O — eOU+9@)}] 4, 
t 





0 
(72) T1/(k-2) T 
i ane | + Qn [ a Jit Je say. 
0 bn 
By (34) we have 
(73) I< a | (ht ee HE NeH dt = Qh. 


Also, 
a a) n 7 
ee sie | | p(t/-Vn) |" : pork OIL + VOI g 
(74) Je < Quen att ; dt+ Qn — 
The second term in the right-hand side of (74) is evidently <Q,. The first 
term does not exceed 
(75) Qn T Lub. | pd |" 
t2>Qx 


At this step we make use of the non-singularity of P(x) and apply Lemma 7 
for m = 1. We have 


lu.b. |p| = 
t=Q- 
Hence (75) does not exceed Qn? e~%" < Q.. We have therefore 


ew 
(76) Té 3 l = 08? ae — dx a <@. T = Qunt*. 


/ 


Arguing with (76) as we did with (70) we conclude that 
T =n) ° 


lub.| F(z) — &(z) — ¥(x)| < 2 = _@ 


(72) is valid for $T > 1$. If $T \le 1$, we have only to suppress the term $J_2$. Hence 
Theorem 1 is proved. 


4. Proof of Theorem 3 and Theorem 4. 3.1. In connection with the random 
variables (1), we assume that 6., < © for some integer k > 3 and define 


(77) n= DG - Bs Ge) =I (VEO RD < ah 







sa a eee m= j ee 
~~ lee foie gs => 


1 (§ — 1) 5 
78 = = == caine y = . 
( ) /n z V os i 1 ’ Vv n g 


Now, 


where 


Hence 

(79) G(z) = Pr{X — XY’ < 2} 

with 

(80) he plo, 
V n(os — 1) 


Let W be the probability function of the distribution of the random point 
(X, Y) and $f(t_1, t_2)$ be the characteristic function: 


(81)  $W(S) = \Pr\{(X, Y) \in S\}$ for every Borel set S in $R_2$, 
° ° hy te - 
a ityX+iteY ame te nein: 
(82) f(t, te) = ee ) \p (<.. J.) 
(83) p(t, ) = [ Pere, 


Let G,(z) be the distribution function of X. Then 


(84) Giz) — G,(z) = | / dW = K(2), say. 
z<xr<sztry?2 

Let 

(85) K.2) = | | eo aw. 
z<zszthy2 


If we define (for fixed z) the function G(x, y) by 

(86) G(x, y) = ew if zcarsztny’, G(z,y) = 0 otherwise, 
then 

(87) kK.) = [ [ G@,» aw. 


Letting 


(88) [. [ cea, ») de dy = a(t, 8), 










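we replace x by x - u in the integral and get 

(89)  $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-it_1 x - it_2 y}\, G(x - u, y)\, dx\, dy = e^{-it_1 u}\, g(t_1, t_2).$

Multiplying both sides by $\frac{1 - \cos Tu}{u^2}$ and integrating with respect to u we 
obtain, with the help of (59), Lemma 11, 

(90)  $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-it_1 x - it_2 y}\, dx\, dy \int_{-\infty}^{\infty} \frac{1 - \cos Tu}{u^2}\, G(x - u, y)\, du = \pi (T - |t_1|)\, g(t_1, t_2)$ if $|t_1| \le T$, and $= 0$ if $|t_1| > T$; 

the reversion of order of integration in the left-hand side is obviously justifiable. 
By Lemma 9 the right-hand side of (90) is summable in the whole plane of 
$(t_1, t_2)$. Hence, by Lemma 10, 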


[ ey ~ oh 


uw 
(91) a 
= 2 ff @-labote, we" did. 


[ti[ <7 


If we integrate both sides with respect to the probability function W, we obtain, 
on reversing the order of integration, 


[ toe TY a [ [¢@-u, naw 


Ro 


(92) 
=A] f @-\uboG, wp, dnd. 
[ta] S7 
By (86) and (87), 
(93) [ i G(x — u, y) dW = Kut 2). 


Hence 
[ += 00s TY Ku + 2) du = i / / (T ae | ta |g (ts ? te) f(t ’ te) dt, dtz ‘ 
: jéi] ST 
We now take the functions 


—4( ti+t3+2pti te) 


(95) g(t, t) = 








and (it; , 72) as in (35), where 


Re a3 
(96) p= [ Sa are SS. 


Since the condition a, — 1 — a3; ¥ 0 is assumed in Theorem 3 and implied in 
Theorem 4, we have |p| <1. Let 








CA —(1/2(1—p? 2+y2—2 
(97) w(x, y) = 5e (1/2(1—p?)) (a2+y pry) 


WT [= - p 
and let y(x, y) be the function obtained from y(it; , it2) through the replacement 
vy tne 
of each power (it;)”! (ite)? by (—1)2*"W,,»,.(2, y) = (—1)"*"2 an wl, Y) 


dx” dy”? 
Since 


(98) w(x, y) — eye [ | ¢ 119-39 O(t, ; tz) dt; dt. ’ 
we have 


vitve 
(99) Wry, (2, y) = os 





i [ (its)”! (ite)”2e ** "2" O(ty , te) dtr dle , 
whence, by Fourier inversion, 
(100) (it;)"* (tte) ”* (ts ; to) = [ [ eftiztitaiy, (2, y) dxdy. 


From the definition of y(x, y) it follows therefore 


cor) ff ee fw, 9) + r(@, w)} dz dy = ol, W){1 + vb, ih)}. 


A comparison of (101) with [ [ eta" QW = f(t, , ts) shows that (94) will 


remain true if K,.(u) be replaced by 


(102) [ [eco w) + v(e, y)) dz dy = Liu), say, 


u<cxrsutry2 


and f(t; , t2) be replaced by g(t , t2){1 + (tt, tte)}. Hence 


[ = = {Ku + z) — Leu + 2)} du 


(103) a i [f (T — |t|)g(ts, te) {f(r t) 


léa| ST 


— (ti, to)[1 + Wits , ite)]} dtidh. 







Let also 


(os) He) = ff {w(e, x) + (2,9) deay, 


z—dy2 <z 


Hye) = | | twee, ») + ¥@, 9)} ae ay, 


(105) L@) = H@) — He) = | [  {w(e,y) +, y)} dedy. 


z<zr<zthry?2 


3.2. We now consider the particular case k = 3 and prove Theorem 3. For 
k = 3 we havey = y = 0 andso 


H(z) = [ff w(x, y) dx dy, 


z—hy2 <z 


(106) 
H(z) = | [ we. y) dx dy = ®(z), 


L(z) = H(z) — M2), 


(107) L.(z) - / / e w(x, y) dx dy, 


zgzszthry? 


[ 1 — cos Tu {K(u + xz) — Leu + 2)} du 


2 uw 
(108) 
= = I (T —|ti|)g(tr, te) {f(tr, te) — o(ts, te) }dtidts . 


[ti] ST 
Now 
K.(u) — L.(u) = {Gu) — &(u)} — {[H(u) — &u)} — [Gi(u) — B(u)} 
— {K(u) — K.u)} + [Lu) — L.(u)}, 


1 eo = utry2 pe - 
0 < H(u) _ &(u) = ariel er ay | e uaa—e") (xr—py dx 
r 0 


a rN 
ee —_ 2 ~2y" = ea a 
s QrvV/1 — p [. ye dy V/2x(1 — p?)’ 
A fr?| 2 —1 fF A 
| Gi(u) o &(u) | ; wal = dP < a Ve by Theorem 2, 
— 30 ie 


=" 
0 < K(u) — Ku) < e(Y°) < Aase by Lemma 12, 
0 < L(tu) — Lu) < Ae. 





Hence 


, = —= {Gu + ») — &(u + A)} du 


u 


1 
a Rn VnV (oa — 1)(1 = p?) 


+ OT I | g(t , ty) +] f(t » le) — (ti, to) | dtydly . 


je, | 57 


(109) = OT jou +o + - 


It is easy to verify that 


sci to 1 “ & ae )" 
(a — Dt via wt ia* 


For the left-hand side of (109) we refer to (50) and take x to be the number a 
therein. Hence 


1 — cos u ( \ - 3/2 
1% 3[" — du -*\< AT) se + — ( ) 
\ ‘oe Vn \ou —-1—03/ } 


(110) + AT rf Jo(ts, ts) |-| f(r, &) — el, t)| dtrdt 


li} S7 |t2| 57 


+ AT I | g(ti, te) | dtrdle . 
jer; ST [lg] >7 


By Lemma 9 (ii) we have 


rt / | g(t, t) | dty dty 
l¢é1] ST, [to] >T7 
sar | | 


[al sT.[te1>7 
WT 72 
ca(n48Pa® 


Hence 


3/2 7 >* 7 Al »? 7* 
<A {este + (——* 2) £a4* ‘ rh 


a er Vn 


4+ AT [f lots, &) [lf — ¢| dt dt. 


Ita] ST. |to| <7 







By Lemma 9 (i) with k = 3 we have 
AT 
ais) T ff lol-ly-eldnde<4P ff |p-olauan. 


léi[ ST. [te] s7 lta] S7.[to[ <7 
By (37) under Lemma 3, 
(114) |f — 0] $F Cults? + bel eet for |¢,| < 4A = evn 
Vn er Bi 
with 


©) 2 4 8 
Bu = ft wos dP < a el. (a® + 1) dl 
Sac 


(115) 
se - be = [ |2f aP =m. 


We now take 
oaiuae _ 
(116) ¢«* (2-1-3) Vn; 
8 as 
the A coinciding with that in (114). Then 
A(l — p')Vn > AG - p )(ox — 1)!-/n 
B31 Sag 
_ Alu — 1 — a3s)/u — Ivny > Ala — 1 — a3)'-Vn _ ~~. 
Sag Saz/? 
AL = p)WVa _ Aa — 1 - oda 
Bae (a4 — 1)Bs 
> Alon — 1—a3)'V/n > Ala — 1 — a3)! Vn 
— oreo = fe tonnes 
a, B3 ag 


Hence (114) is true for $|t_1| \le T$ and $|t_2| \le T$. Using this fact on (113) we 
obtain 


(117) 





(118) 


rf] lolls-elduna, 


[és ST. [ee| 57 


st aa. - —— hee ~p2) (¢7+2) dty dts 
é} (a% _ 


< ATA (__w \ 
= e /n im i 1)" Bs 


1 
_ 25/2 

7 - 
= ve (cg (cvs = 1) + Bs (a4 a 1)"”) 


AT 1 
ose (a6-Va4 — 1 + Bs(a4 — 1)”) = i = 


(i a" 


ATox'" 
~ nV elas — 1 — 05)? 







Substituting in (112), setting « = (as7’)’ and using (116) we obtain after some 
easy reduction 


Té ‘ 
anf io. “| 
0 uU 


1 ae : . 
<A}j1 == == caneinneere erodes ieee aaa ‘ 
i | * fila = tata ‘oo 1 a 
If n > (a, — 1 — a) ‘as, then the right-hand side of (120) is < A, and so, 
arguing with (120), as we did with (70), we obtain 


(120) 





T Vn ao —1— a3 


For n < (a, — 1 — a3) ‘as, however, the right-hand side of (121) > A(a, — 
1 — a3) ‘as > A and (121) becomes a triviality. Hence Theorem 3 is proved. 
3.3. To prove Theorem 4, we start again with the identity (103). We have 


Ku) — Lu) = {G(u) — H(u)} — {Gi(u) — Ai(u)} 


— {K(u) — K.u)} + {L) — Lu}, 
(123) 0 < K(u) — Ku) < e(Y") < Qe by Lemma 12, 


(124) 0 < Lu) — Law) ef [wee y) + |v@,v)|) de dy < Qe. 


(121) lub. |G(u) — &(u)| < oa Sy ( “ . 
(122) 


Let us show that 
(125) |Gi(u) — Hi(u)| < Qi/n*. 

The function X = Je 2 ( po =) has the same structure as ~/n é (with 
(as — 1) 3 — 1) playing the role of £;); hence, by Theorem 1, there exists 
an asymptotic expansion of the distribution function G;(u). We shall see that 
the terms of this asymptotic expansion are precisely H,(u), whence (125) follows 
from Theorem 1. 

It is obvious that for the polynomial $\psi(it_1, it_2)$ in (35) $\psi(it, 0)$ coincides with 
the polynomial $\psi(it)$ in (34). Hence the terms of the asymptotic expansion 
of $G_1(u)$ are the inversion of $e^{-t^2/2}\{1 + \psi(it, 0)\}$, viz. 


(126) &(u) + = [ dx [ e 4? Yt, O) dt. 


On the other hand, by (104), 
(127) Hu) = @u) + [ae [ v(x, v) dy, 
and by (101) with & = 0, 


(128) [ e* dx [ v(x, y) dy e Wit, 0). 







Inversion of (128) gives 


(129) [vena =f] ya, at 


which establishes the equality of $H_1(u)$ and (126). 
Using (122), (123), (124), (125) on (103) we get 
/ ee a 4 Oe ee (« 4 as) 
u> nitk-2 


—o 


(130) 
tor [ [\on,)|-fG, &) — eG, elt + Wit, i)]| ddl. 


If we expand 


(131) H(u) = I {w(x, y) + y(a, y)} dx dy 


‘ ae ‘ . —}(k—3) 
in powers of n * up to and including the term n° *“ 


Ayn **-”, Hence 
(a3) Hu) = ®(u) + x(u) + n/n, 


where ®(u) + x(u) is the group of terms of the Taylor expansion of (131) in 
powers of n * up to and including the term n-*“~?. From (130) and (132) we get 


, the remainder is obviously 


| — = {Gu + z) — ®(u + 2) — x(ut+a2)} du| 


(133) . 
SQ? (« T aa) + Al, 


where 


(134) J=T [J | g(tr , te) |-|f(t, &) — ol, &){1 + Wit, ite)} | dtidte . 


léi|S7 


We are going to prove that the function x(u) here defined satisfies all the 
requirements of the function x(u) in Theorem 4. The structure of x(u) an- 
nounced in Theorem 4 is easily verifiable. It remains to prove the inequalities 
(15) and (16) satisfied by 


|G(u) — &(u) — x(u) |. 
It is obvious that the function @(u) + x(u) has all the properties of the 
function /’,(u) in Lemma 8, having a bounded derivative | ®’(w) + x’(u) | < Q:. 


Hence, on taking z in (133) to be the number a in (51), the left-hand side of (133) 
does not exceed 


Té 
Q,.T6 (3 [ = du — r), 5 = Q,1.u-b. | Gu) — &(u) — x(u) . 
' 2 







Hence 


Té 
(135) ro(3 18" du — 4) < QT (e+ ota) + Ql. 


In order to appraise J we recall (35) under Lemma 3 (replacing therein each 
Bi by the larger number By8;2, and merging the latter into Q,) 


age Pe — eth WE + Weth aH Hats (2th + 


+ [tafe po OPC” 
for 
(137) lts| < Qn. 


Put T = (Qir\/n)', with Q, here coinciding with that in (137) and then (136) 
is valid for |4| < 7’ and |t.| < T’’. Write 


r=r ff +27 ff +7 ff =n+nen. 


leg] <7)! top eri! lta] <7. |te|>7!!! Tille|jt)| <7 
jtgjsrilt 


By Lemma 9 (i), 


Q. T 
(138) h< “Tail | | If — 1 + V) | dts dts , 
whence, by (136) 


hs MPT L(G dal + + lar) 


—(1—p2) (47+t?)/8 T 
ae dtydlp < —% 


= yih-D 3/2 ? 


(139) 


By Lemma 9 (iii) we have 


In QT [I Tak (Ze 12k 


|ta]<T.Jt2j>TI!! 


+ ae) {f(t » b2)| + oft, b) | 1 + Wits , ite) |} dt; dlp . 


Obviously, 
(140) lub. g(t, )|1+ ¥t, it)| = o™™. 
t2> cpilk—2 


On the assumption of non-singularity of P(x) we have, by Lemma 7, 
lub.  |f(t, t)| Lub \p (S. 2. = 
ub. | ; = u.b. — = 

Sar" |t2|>QnV/n Vn? Vn 


ty 
» ins, |o(Se,0)f - 
orem P Vn 


(141) 







Hence 


— L 1 | é1 
In <Q Te" / / lt ( Vine™ + 3) dl, dt 


| ¢ ve 


leg] <T.Jte|>71/! 
(142) leat |¢2|> 
l-1 (3/2) (1-1) 
n n —ae 
= Q (Sa + ae 
(5 ¢3/2k ) 


ul a ' 
For I; we have | 4:| > T'' = QeVn, and so Lemma 7 is applicable to J; in the 
same manner as to J,. Using Lemma 9 (i) on the factor | g(t: , t2) | we get 


l —nQ;: 
ee 


(143) I, < @” 


S BR 


Combining (135), (138), (139), (142), (143) we obtain 


T6 ‘9 l 9 
i 1 — cos u 1/2 nn” 
Té (3 | ——du —-r)/<Q(n e+ — + =, a 
0 u n 2 &—1) @8/2k 
1¢ 
( 44) t-1 3/2(/—1) 
n 


nv 
+ Q: (a + e3/2k 


' | ‘ es 
Putting « = ces OCIEE) we get, as the last term in (144) is < Q,, 


Té 
| — cos u a 
T6 3 [ tia a r) <Q + Q.n' “{ — sic = im. = M,, 
0 u- nk (&-D/ (2k +3) nitk—-2) 


If4 <k < 6, we take 1 = k — 2 and get 


TS 
a 1 — cos u 1 
15(3 [1 8%" au — x) <Q + O(a +1) SO, 


Hence, by the argument following (70), 


lu.b.| Gv) — (vu) — x(u)| < Qe _ OX 


TT he 
2k(k — 1) 
2k + 3 


Té6 0S 7 
T5 (3 | . — : du — r) < Qi. + Qe (: + — < Qi. 
0 uU : Per are 


Hence 


giving (15). If k > 7, we take l = and get 


Qi 


eke 9()-4-2) 9 
mks 1) /2(k+3 


lub. | Gu) — &(u) — x(u) | < _ = 
giving (16). Therefore Theorem 4 is proved. 


5. When a — 1 — a3 =0. Ifa, — 1 — a3 = 0, then there is unit probability 
that £ assumes exactly two values: 


Prité; = a} = 2, Prté; = b} =e, Pp + ,= 1. 







Let ¢; = 1 with probability p and ¢; = 0 with probability g. Then é = b + 
(a — b)t&i, n = (a — b)’ : =(¢; — &)*. Hence it is sufficient to consider the 


variable = >» & — &? = ». Letting Df; = r = np + Wnpgq Xwe have m = 


2 
r— f = npg + (¢ — p)V npg X — pgX. We now consider two distinct cases: 


Case (7). - ~q. Here 


F(z) = Pr ‘reer 2 < 3} 
p — 4\Vnpq 
= Pr{(X + eV/n)’ > en — 2\e|-V nz}, ¢ = a 
Thus F(z) = lifz >4\c|Vn. If z < 34|c| Vn, then 
F(z) = Pr{X < —cn — (en — 2|c|-Vnz)*} 
+ Pr{X > —eV/n + (Cn — 2) | Vnz)'} = Filz) + F2(2). 


To the random variable X Theorem 2 can be applied. Suppose that c < 0; 
then, by Tchebycheff’s inequality, 


F(z) < Pr{X > —en} < = < : 
(p — q)’n 


By Theorem 2, 
F,(z) = Pr{X < —cen — (?n — 2\c|+/nz)'} 


- @2? O(p? + 4q’) 
+ Talo—a* Van 





Hence 


p+ 4 2 1 \ 
145 | F(z) — &(z sagemepemrnnionins: diy, sueneomonontan y 
sia @) — #@)| s se + + Vnlp—a|* n@ —@ 
The same inequality holds also for c > 0. 
Case (ii). $p = q = 1/2$. Here $m = \tfrac{1}{4}(n - X^2)$; hence 


_ 1 . 9 
(146) Pr i > — = Pr{X*? < 2} = val ate dx + fe 


There is no asymptotic expansion for the distribution function of m. (See 
(C), p. 83.) 





SAMPLING INSPECTION PLANS FOR CONTINUOUS PRODUCTION 
WHICH INSURE A PRESCRIBED LIMIT ON THE OUTGOING QUALITY 


A. WALD AND J. WOLFOWITZ 


Columbia University 


1. Introduction. This paper discusses several plans for sampling inspection of 
manufactured articles which are produced by a continuous production process, 
the plans being designed to insure that the long-run proportion of defectives 
shall not exceed a prescribed limit. The plans are applicable to articles which 
can be classified as "defective" or "non-defective" and which are submitted for 
inspection either continuously or in lots. In Section 2 the notions of "average 
outgoing quality limit" and "local stability" are discussed. The valuable 
concept of average outgoing quality limit for lot inspection is due to Dodge and 
Romig [4], and that for inspection of continuous production to Dodge [1]. 
Section 3 contains a description of a simple inspection plan (SPA) applicable to 
continuous production and a proof that the plan will insure a prescribed 
average outgoing quality limit. Section 4 contains a proof that this inspection 
plan also has the important property that it requires minimum inspection when 
the production process is in statistical control. Section 5 contains the 
description of a general class of plans which possess both these important 
properties. 

The problem of adapting SPA to the case when the articles are submitted for 
inspection in lots instead of continuously, is treated in Section 6. Some methods 
of achieving local stability are discussed in Section 7 and a specific plan is devel- 
oped there. Finally Section 8 discusses the relationship between the present 
work and that of the earlier and very interesting paper of H. F. Dodge [1], 
mentioned above. 

If a quick first reading is desired the reader may omit the second half of Section 
3 (which contains a proof of the fact that SPA guarantees the prescribed average 
outgoing quality limit) and the entire Section 4 except for its title (the proof of 
the statement made in the title of Section 4 occupies the whole section). 


2. Fundamental notions. In this paper we shall deal only with a product 
whose units can be classified as "defective" or "non-defective." We shall 
assume that the units of the product are submitted for inspection continuously, 
except in Section 6, where we assume that they are submitted in lots. 
Throughout the paper we shall assume that the inspection process is non-destructive, 
that it invariably classifies correctly the units examined, and that defective units, 
when found, are replaced by non-defectives. By the "quality" of a sequence of 
units is meant the proportion of defectives in the sequence as produced. By the 
"outgoing quality" (OQ) of a sequence is meant the proportion of defectives 
after whatever inspection scheme is in use has been applied. If this 
scheme involves random sampling, then in general the OQ is a chance variable. 



(It depends on the variations of random sampling.) If the OQ converges to 
a constant $p_0$ with probability one as the number of units produced increases 
indefinitely, $p_0$ is called the "average outgoing quality" (AOQ). The AOQ 
when it exists is therefore the average quality, in the long run, of the production 
process after inspection. It is a function of both the production process and the 
inspection scheme. These definitions are due to Dodge [1]. 

The "average outgoing quality limit" (AOQL) is a number which is to depend 
only on the inspection scheme and not at all on the production process. Roughly 
speaking, it is a number, characteristic of an inspection scheme, such that no 
matter what the variations or eccentricities of the production process, the AOQ 
never exceeds it. For the purposes of this paper we shall need the following 
precise definition: Let $c_i$ be zero or one according as the ith unit of the product, 
before application of the inspection scheme, is a non-defective or a defective, 
respectively. Let $d_i$ have a similar definition after application of the inspection 
scheme. (We note that if the ith item was inspected, then $d_i = 0$; if the ith 
item was not inspected, then $c_i = d_i$.) The sequence $c = (c_1, c_2, \dots, c_N, \dots$ 
ad inf.) characterizes the production process¹. The elements of $d = (d_1, d_2, \dots$ 
ad inf.) are in general chance variables. The number L is called the AOQL if it 
is the smallest² number with the property that the probability is zero that 

$\limsup_{N} \frac{\sum_{i=1}^{N} d_i}{N} > L,$

no matter what the sequence c. 

It should be noted that this definition of AOQL places no restrictions whatever 
on the production process, since all sequences c are admitted. It is too much 
to expect a production process to remain always in control; indeed, doubt as to 
whether statistical control always exists may cause a manufacturer to institute 
an inspection scheme. The inspection schemes which we shall give below will 
yield a specified AOQL no matter what the variations in production are. If 
these schemes are employed, then, even if Maxwell’s demon of gas theory fame 
were to transfer his activities to the production process, he would be unsuccessful 
in an effort to cause the AOQL to be exceeded. A dishonest manufacturer might 
sometimes essay to do this. If we imposed restrictions on the sequence c and 


1 This use of an infinite sequence to describe the production process deserves a few words. 
What we consider in this paper are schemes which are applicable when the number of units 
produced is large and which operate mathematically as if the production sequence were of 
infinite length. Naturally the latter is never the case in actuality. However, the larger the 
number of units produced the more nearly will the reality conform to the results derived from 
the mathematical model. While the present definition uses explicitly the notion of an infinite 
sequence, such a commonplace statement as "the probability is 1/2 that a coin will fall 
heads up" uses this notion implicitly. It is also implicit in the intuitive meaning we ascribe 
to such a word as "average," which is in everyday use. 

2 It is not difficult to see that such a number always exists, for it is the lower bound of a 
set which is non-empty (it contains the point one), bounded from below (zero is a lower 
bound), and closed. 







determined the AOQL on that basis, we would run the danger that the relative 
frequency of defects in the sequence of outgoing units might exceed the AOQL if 
it happened that the actual sequence c did not satisfy the restrictions imposed. 

After we discuss below various possible sampling inspection plans which 
insure that the AOQL does not exceed a predetermined value L, it will be seen 
that for any given L > 0 there are many sampling inspection schemes which do 
this. To choose a particular sampling plan from among them the following 
considerations may be advanced: If two inspection plans S and S' both insure 
the inequality AOQL $\le$ L and if for any sequence c the average number of 
inspections required by S is not greater than that required by S' and if for some 
sequences c the average number of inspections required by S is actually smaller 
than that required by S', then S may be considered, in general, a better 
inspection plan than S'. However, the amount of inspection required by a sampling 
plan is not always the only criterion for the selection of a proper sampling 
scheme. There may be also other features of a sampling plan which make it 
more or less desirable. We shall mention here one such feature, called "local 
stability," which will play a role in our discussions later. Consider the sequence 
d obtained from the sequence c by applying a sampling inspection scheme. Even 
if the AOQL does not exceed L, it may still happen that there will be many large 
segments of the sequence d within which the relative frequency of ones is 
considerably higher than L. For instance, it may happen that in the segment 
$(d_1, \dots, d_m)$ the relative frequency of ones is equal to $\tfrac{3}{2}L$, in the segment 
$(d_{m+1}, \dots, d_{2m})$ the relative frequency is equal to $\tfrac{1}{2}L$, in the segment 
$(d_{2m+1}, \dots, d_{3m})$ the relative frequency is again equal to $\tfrac{3}{2}L$, and this is 
followed again by a segment of m elements where the relative frequency of ones is 
equal to $\tfrac{1}{2}L$, and so forth. If m is large, such a sequence d is not very desirable, 
since each second segment will contain too many defects. A sequence d is said to be not 
locally stable if there exists a large fixed integer m such that the relative frequency 
of ones in $(d_{k+1}, \dots, d_{k+m})$ is considerably greater than L for many integral 
values k. On the other hand, the sequence d is said to be locally stable if for 
any large m the relative frequency of ones in $(d_{k+1}, \dots, d_{k+m})$ is not 
substantially above L for nearly all integral values k. This is clearly not a precise 
definition of "local stability," but merely an intuitive indication of what we want 
to understand by the term, since we did not define what we mean by "large m," 
"many values of k," "considerably above L," etc. A precise definition of local 
stability will not be needed in this paper, since it is not our intention to develop 
a complete theory for the choice of the sampling plan. The idea of local stability 
will be used in this paper merely for making it plausible that some schemes we 
shall consider behave reasonably in this respect. A similar idea, called 
"protection against spotty quality," is discussed by Dodge [1]. A possible precise 
definition of local stability could be given in terms of the frequency with which 

$F(N) = \frac{1}{k + 1} \sum_{i=N}^{N+k} d_i$  (k being fixed) 

lies within given limits. 
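The contrast between a long-run frequency equal to L and poor local stability can be made concrete with a small computation. The sketch below is illustrative only: it builds a hypothetical outgoing sequence d whose blocks of m units alternate between a high and a low number of defectives (15 and 5 per 100 units, values chosen for this example with L = 0.10 in mind), then reports the relative frequency of ones in each block. The function names are assumptions and do not come from the paper.

```python
def window_frequencies(d, m):
    """Relative frequency of ones in each consecutive block of m terms of d."""
    return [sum(d[s:s + m]) / m for s in range(0, len(d) - m + 1, m)]

def alternating_sequence(m, high, low, blocks):
    """Hypothetical outgoing sequence: blocks of m units with alternating defect counts."""
    d = []
    for b in range(blocks):
        ones = high if b % 2 == 0 else low
        d.extend([1] * ones + [0] * (m - ones))
    return d

if __name__ == "__main__":
    d = alternating_sequence(m=100, high=15, low=5, blocks=4)
    print(sum(d) / len(d))             # overall relative frequency of ones
    print(window_frequencies(d, 100))  # block-by-block relative frequencies
```

The overall frequency equals 0.10 while every second block runs at 0.15, which is the kind of behaviour the text calls not locally stable.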







3. A sampling inspection plan which insures a given AOQL no matter what 
the variations in the production process. The only feature of the sampling 
(inspection) plan (SP) studied in this section and hereafter referred to as SPA 
which we shall consider here is that it insures the achievement of a specified 
AOQL. Considerations leading to a choice among several schemes are postponed 
to later sections. 

For convenience, let f be the reciprocal of a positive integer. SPA calls for 
alternating partial inspection and complete inspection. Partial inspection 
is performed by inspecting one element chosen at random from each of successive 
groups of $\frac{1}{f}$ elements. Complete inspection means the inspection of every 
element in the order of production. SPA is completely defined when a rule 
is given for ending one kind of inspection and beginning the other. 

It is clear that all SP need not be of the above class. Thus, for example, a 
scheme might consist of partial inspection with various f’s employed in various 
sequences. We make no attempt in this paper to examine all possible schemes. 
For simplicity in practical operation, alternation of complete inspection and 
partial inspection with fixed f would seem reasonable. The Dodge scheme [1] 
is of this type. 

We shall also not discuss the question of a choice of the constant f, but will 
assume that a particular value has been chosen for various reasons and is a datum 
of our problem. Reasons which might influence a manufacturer in his choice 
of f could be contract specifications which impose a minimum on the amount of 
inspection, or psychological grounds to the same effect. The manufacturer 
may desire a certain minimum amount of inspection in order to detect mal- 
functioning of his production process. Also f controls local stability to some 
extent. The consequences of a choice of f as they appear in the theory below 
may also play a role. 

Returning to SPA, we begin with partial inspection. Let L be the specified 
AOQL. Denote by $k_N$ the number of groups of $\frac{1}{f}$ units in which defectives 
were found as the result of partial inspection from the beginning of production 
through the Nth unit. SPA is as follows: 

(a) Begin with partial inspection. 

(b) Begin full inspection whenever 

$e_N = \frac{k_N \left( \frac{1}{f} - 1 \right)}{N} > L.$

(c) Resume partial inspection when 

$e_N \le L.$

(d) Repeat the procedure. (It will be recalled that defective units, when 
found, are always to be replaced with non-defectives.) 
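Rules (a) to (d) can be traced on a finite stretch of production with a short simulation. The sketch below is a reading of the plan under stated assumptions, not a definitive implementation: the quantity e_N = k_N(1/f - 1)/N is recomputed after each partially inspected group and after each fully inspected unit, the group size 1/f is passed as an integer, an incomplete final group is dropped, and the names `simulate_spa`, `group_size` and `rng` are introduced here for illustration.

```python
import random

def simulate_spa(c, group_size, L, rng=None):
    """Apply SPA to the 0/1 production sequence c with f = 1/group_size.

    Returns (d, inspected, partial_groups, k): the outgoing sequence, the number
    of units inspected, the number of partially inspected groups, and the number
    of those groups in which a defective was found.
    """
    rng = rng or random.Random(0)
    d = []
    k = 0
    inspected = 0
    partial_groups = 0
    n = 0                      # units processed so far
    full = False               # rule (a): begin with partial inspection
    while n < len(c):
        if full:
            # Complete inspection: the unit is inspected and replaced if defective.
            d.append(0)
            inspected += 1
            n += 1
        else:
            if n + group_size > len(c):
                break          # assumption: drop an incomplete final group
            group = list(c[n:n + group_size])
            pick = rng.randrange(group_size)   # one unit chosen at random
            if group[pick] == 1:
                k += 1
                group[pick] = 0                # found defective is replaced
            d.extend(group)
            inspected += 1
            partial_groups += 1
            n += group_size
        e = k * (group_size - 1) / n
        full = e > L           # rules (b) and (c)
    return d, inspected, partial_groups, k

if __name__ == "__main__":
    d, inspected, groups, k = simulate_spa([1] * 20000, group_size=5, L=0.1)
    print(sum(d) / len(d), inspected, groups, k)
```

With every unit defective, as in the usage above, the outgoing proportion of defectives settles close to L, since each partially inspected group passes group_size - 1 defectives and full inspection runs until e_N has fallen back to L.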







It is to be observed that in this plan the number of partial inspections increases 
without limit. For, while complete inspection is going on, the value of $k_N$ 
remains constant, so that after a long enough period of complete inspection the 
denominator N of the expression which defines $e_N$ will have increased sufficiently 
for $e_N$ to be not greater than L. On the other hand, complete inspection may 
never occur. This will be the case if, for example, no defectives or very few 
defectives are produced. 

We shall now show that the AOQL of the above SP is L. We first note that, 
at N, $e_N$ can increase only by $\frac{\frac{1}{f} - 1}{N}$. Hence, for sufficiently large N, 
$e_N < L + \epsilon$, where $\epsilon > 0$ may be arbitrarily small. 
Suppose now that the production process is subject to any variations 
whatsoever, i.e., the sequence 

$c = (c_1, c_2, \dots, c_N, \dots$, ad inf.) 

is any arbitrary sequence whatever (by their definition the $c_i$ are all zero or one). 
Our result is therefore proved if we show that, with probability one, 

(3.1)  $\lim_{N \to \infty} \left( e_N - \frac{1}{N} \sum_{i=1}^{N} d_i \right) = 0$

for this arbitrary c, and that for at least one c 

(3.2)  $\lim_{N \to \infty} e_N = L.$


Let S(N) be the number of groups of $\frac{1}{f}$ units which have been partially 
inspected through the Nth unit. Define $x_i$ as zero if in the ith partially inspected 
group a non-defective was found and as one if a defective was found. We have 

$k_N = \sum_{i=1}^{S(N)} x_i.$

Since the number of times partial inspection takes place increases indefinitely, 
$S(N) \to \infty$ as $N \to \infty$. Also $S(N) \le fN < N$. Let $a_j$ be the serial number 
of the last unit in the jth partially inspected group. Then for all j the expected 
value $E(x_j)$ of $x_j$ is given by 

$E(x_j) = f \left( \sum_{i = a_j - (1/f) + 1}^{a_j} c_i \right).$

We have, for all j, 

(3.3)  $\sum_{i = a_j - (1/f) + 1}^{a_j} (c_i - d_i) = x_j,$

so that 

$E \left\{ \left( \frac{1}{f} - 1 \right) x_j - \sum_{i = a_j - (1/f) + 1}^{a_j} d_i \right\} = 0.$







Also from (3.3) it follows, since $x_j$ is the value of a binomial chance variable 
from a population of fixed number $\left( \frac{1}{f} \right)$, that there exists a positive constant $\beta$ 
such that 

(3.4)  $\sigma^2 \left( \left( \frac{1}{f} - 1 \right) x_j - \sum_{i = a_j - (1/f) + 1}^{a_j} d_i \right) < \beta$

where $\sigma^2(x)$ is the variance of a chance variable x. Now a theorem of 
Kolmogoroff (Kolmogoroff [2], Fréchet [3], p. 254) states: 

A sequence of chance variables with zero means and variances $\sigma_1^2, \sigma_2^2, \dots$ 
converges with probability one towards zero in the sense of Cesàro if 

(3.5)  $\sum_{j=1}^{\infty} \frac{\sigma_j^2}{j^2}$


converges. ‘The inequality (3.4) permits us to apply this theorem to the se- 
quence of chance variables of which the jth (j = 1, 2, --- ad inf.) is 


(5-12-3084) 
—~-—Illa;- ra 
f : a ;—(7f) +1 


. wal, : . 
since the series ), = is well known to be convergent. We therefore obtain that, 


i=1 


with probability one, 


oe SW) yw SCV) 
since the units which are fully inspected contribute nothing to =d;. Since 
S(N) < N, the desired result (3.1) is a fortiori true. 

If c is such that all the c; are one, it is readily seen that (3.2) holds. If many 
(this adjective can be precisely defined) defectives are produced, this will also 
be the case. This completes the proof of the fact that the AOQL of SPA is L 
no matter how capriciously the production process may vary. 


t=1 


(G-1]ee-Be) fe 2 


ria) eo 
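The zero-mean and bounded-variance properties used in the argument above can be checked by brute force for a small group. The following is only an illustrative sketch: the function name `group_moments` and the group size 5 are assumptions made for the example, and the group size plays the role of $1/f$.

```python
from fractions import Fraction
from itertools import product


def group_moments(c):
    """Exact mean and variance of (1/f - 1) * x_j - sum(d_i) for one group.

    c is a tuple of 0/1 flags (1 = defective before inspection); len(c)
    plays the role of 1/f. One unit is drawn uniformly at random, and a
    defective that is found is removed from the outgoing product.
    """
    g = len(c)
    values = []
    for chosen in range(g):
        x = c[chosen]               # 1 if the inspected unit is defective
        remaining = sum(c) - x      # sum of d_i over the group
        values.append((g - 1) * x - remaining)
    mean = Fraction(sum(values), g)
    var = sum((Fraction(v) - mean) ** 2 for v in values) / g
    return mean, var


if __name__ == "__main__":
    g = 5  # hypothetical group size, i.e. f = 1/5
    for c in product((0, 1), repeat=g):
        mean, var = group_moments(c)
        assert mean == 0
        assert var <= Fraction(g * g, 4)
    print("zero mean and bounded variance hold for every group of size", g)
```

Whatever the pattern of defectives in the group, the mean is zero and the variance never exceeds $(1/f)^2/4$, which is one admissible choice of a bounding constant in this sketch.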


4. When the production process is in statistical control, SPA requires minimum 
inspection. The production process is said to be in statistical control if there 
is a positive constant $p < 1$ such that, for every $i$, the probability that $c_i = 1$
is p and is independent of the values taken by the other c’s. We shall see that 
if the process is in statistical control and if SPA is applied to it, the specified 
AOQL is guaranteed with a minimum amount of inspection. 

The number of units inspected through the Nth unit produced is 


(4.1) $$I(N) = N - \left(\frac{1}{f} - 1\right)S(N).$$


If the process is in statistical control we have, with probability one,

(4.2) $$\lim_{N\to\infty} \frac{\sum_{i=1}^{N} c_i}{N} = p$$

by the strong law of large numbers. Shortly we shall prove the existence of a constant $L^*$ such that, with probability one,

(4.3) $$\lim_{N\to\infty} \frac{\sum_{i=1}^{N} d_i}{N} = L^*.$$

Assume for the moment that this is so. Since it is only by inspection that defectives are removed, and the units selected for inspection are in statistical control like the original sequence, it follows that, with probability one,

(4.4) $$\lim_{N\to\infty} \frac{I(N)}{N} = \frac{1}{p}(p - L^*) = 1 - \frac{L^*}{p}$$


because, with probability one, 


Inspection is therefore at a minimum when $L^*$ is at a maximum compatible with the specified AOQL. By (4.3) the latter means that

(4.5) $$L^* \le L.$$

SPA has been shown to guarantee this requirement. The optimum situation from the point of view of the amount of inspection would therefore be to have $L^* = L$, but this cannot always be achieved. The absolute minimum amount of inspection clearly is $f$, i.e., partial inspection exclusively. Consequently from (4.4)

$$1 - \frac{L^*}{p} \ge f,$$

so that

(4.6) $$L^* \le p(1 - f).$$


Combining (4.5) and (4.6) we see that we have to consider three cases:

Case a. If

(4.7) $$p > \frac{L}{1-f},$$

we have to show that

(4.8) $$L^* = L.$$

Case b. If

(4.9) $$p < \frac{L}{1-f},$$

we have to show, by (4.4), that

$$\lim_{N\to\infty} \frac{I(N)}{N} = f,$$

that is,

(4.10) $$L^* = p(1-f).$$

Case c. If

(4.11) $$p = \frac{L}{1-f},$$

we have to show that

(4.12) $$L = L^* = p(1 - f).$$
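As a small aid to reading the three cases, the sketch below returns the limiting outgoing quality $L^*$ and the limiting fraction inspected $1 - L^*/p$ from (4.4), with the cases separated by comparing $p$ with $L/(1-f)$. The function name is hypothetical, and exact rational arithmetic is used only so that the boundary case c can be tested without rounding.

```python
from fractions import Fraction


def limiting_quality_and_inspection(p, L, f):
    """Return (case, L_star, inspected_fraction) for a process in control.

    p: probability that a unit is defective, L: specified AOQL,
    f: fraction inspected under partial inspection.
    """
    p, L, f = Fraction(p), Fraction(L), Fraction(f)
    threshold = L / (1 - f)
    if p > threshold:
        case, l_star = "a", L              # quality limit is binding
    elif p < threshold:
        case, l_star = "b", p * (1 - f)    # partial inspection suffices
    else:
        case, l_star = "c", L              # boundary: L = p(1 - f)
    return case, l_star, 1 - l_star / p


if __name__ == "__main__":
    for p in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 45)):
        print(p, limiting_quality_and_inspection(p, Fraction(1, 50), Fraction(1, 10)))
```

In cases b and c the returned fraction equals $f$, the absolute minimum; in case a it equals $1 - L/p$.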


PROOF of (4.8): We have already remarked in Section 3 that in SPA partial inspection always recurs, but complete inspection need never occur. We shall show in a moment that (4.7) implies that no matter how large an integer $\gamma$ is chosen, the probability of temporarily stopping partial inspection for some $N > \gamma$ is one. Assume that this is so. Choose an arbitrarily small positive $\epsilon$, and let $\gamma > \left(\frac{1}{f} - 1\right)/\epsilon$. For a sequence where complete and partial inspection alternate infinitely many times let

$$A = \alpha_1, \alpha_2, \cdots, \text{ad inf.}$$

be the sequence of integers at which partial inspection ends, and let

$$B = \beta_1, \beta_2, \cdots, \text{ad inf.}$$

be the sequence of integers at which complete inspection ends. Then, for all $j$,

$$\alpha_{j+1} > \beta_j > \alpha_j.$$

From the description of SPA it follows that, for all $N > \gamma$ which belong to either $A$ or $B$,

(4.13) $$|e_N - L| < \epsilon.$$

In Section 3 we proved

(3.1) $$\lim_{N\to\infty}\left(e_N - \frac{1}{N}\sum_{i=1}^{N} d_i\right) = 0$$

with probability one. Since $\epsilon$ is arbitrarily small it follows that, with probability one,

(4.14) $$\lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} d_i = L \qquad (N \text{ in } A \text{ or } B).$$





To complete the proof of (4.8) we have still to show that $L^*$ exists and that the probability is one that complete inspection will occur infinitely many times. First we prove that $L^*$ exists.

As $N$ increases during an interval of complete inspection, $D(N) = \sum_{i=1}^{N} d_i$ remains constant. Hence $\frac{D(N)}{N}$ decreases monotonically. Since for the ends of such intervals (4.14) holds, it follows that (4.14) holds as $N \to \infty$ and is a member of $A$, $B$, or an interval $(\alpha_j, \beta_j)$ for all $j$.

Let $N \to \infty$ while always being in the interior of an interval $(\beta_j, \alpha_{j+1}]$, $j = 1, 2, \cdots$, ad inf., which contains $\alpha_{j+1}$ but not $\beta_j$. Let $N^*$ be the total number of units in these intervals through the $N$th unit produced. Let $N_1$ and $N_2$ be such that

$$\beta_j = N_1 < N_2 \le \alpha_{j+1}.$$

Then

$$N_2^* - N_1^* = N_2 - N_1.$$

Since the production process is in statistical control, we have, by the strong law of large numbers,

(4.15) $$\lim_{N\to\infty} \frac{D(N)}{N^*} = p(1-f) = p'$$

with probability one. Let $\delta^*$ be the general designation for numbers $< \epsilon$ in absolute value, so that all $\delta^*$ are not the same. With probability one for almost all $N_1$ we have by (4.15)

$$\frac{D(N_1)}{N_1^*} = p' + \delta^*.$$

Write

$$\frac{D(N_2) - D(N_1)}{N_2 - N_1} = K.$$

Now

$$\frac{D(N_2)}{N_2^*} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1^* + (N_2^* - N_1^*)} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1^* + (N_2 - N_1)} = \frac{(p' + \delta^*)N_1^* + K(N_2 - N_1)}{N_1^* + (N_2 - N_1)} = p' + \delta^*.$$

Hence

(4.16) $$K(N_2 - N_1) = 2\delta^* N_1^* + (p' + \delta^*)(N_2 - N_1).$$







Now suppose (4.3) does not hold. From the definition of AOQL it follows that for some $\eta > \epsilon$ there exist sequences (whose totality has a positive probability) so that, for infinitely many $N_2$ we have

(4.17) $$\frac{D(N_2)}{N_2} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1 + (N_2 - N_1)} < L - 4\eta.$$

For large enough $N_1$, from (4.14),

(4.18) $$\frac{D(N_1)}{N_1} = L + \delta^*$$

with probability one and hence, using (4.16) in (4.17),

$$N_1(L + \delta^*) + 2\delta^* N_1^* + (p' + \delta^*)(N_2 - N_1) < LN_1 + L(N_2 - N_1) - 4\eta N_2,$$

from which, using the fact that $p' > L$ (from (4.7)), we get

(4.19) $$N_1\delta^* + 2N_1^*\delta^* + \delta^*(N_2 - N_1) < -4\eta N_2.$$

((4.18) and (4.19) hold for the sequences for which (4.17) holds, except perhaps on a set of sequences whose probability is zero.) Since $N_1^* \le N_1$ and $|\delta^*| < \eta$, we have, on the other hand,

(4.20) $$N_1\delta^* + 2N_1^*\delta^* + \delta^*(N_2 - N_1) > -3\eta N_1 - \eta(N_2 - N_1) > -4\eta N_1 - 4\eta(N_2 - N_1) = -4\eta N_2$$



which contradicts (4.19) and proves the desired result ((4.3) and (4.8)), except that it remains to prove that, no matter how large $\gamma$, the probability of temporarily stopping partial inspection at some $N > \gamma$ is one. Let $\gamma_0 > \gamma$ be some integer at which partial inspection is going on. From (4.2) and (4.7) it would follow, if partial inspection never ceased on a set of sequences with positive probability, that, on this set, with conditional probability one, for $N$ sufficiently large and $\epsilon$ sufficiently small,


ae 
asa" t3 
N kel —f) 
N- Yo {N 
«eae + WW =e, 


a 


+ ¢, 


> L + (1 — fie, 


en > L+ =m 


This contradiction proves that complete inspection is eventually resumed and 
completes the proof of minimum inspection in Case a. 







PROOF of (4.10): We shall prove that (4.9) implies that, with probability one, complete inspection will cease, never to be resumed. For, from (4.15) and (4.9) it follows that for $N$ sufficiently large and $\epsilon$ sufficiently small,

(4.21) $$\frac{D(N)}{N^*} = p' + \delta^* < L - 2\epsilon.$$

Hence, a fortiori,

(4.22) $$\frac{D(N)}{N} < L - 2\epsilon.$$

((4.21) and (4.22) hold with probability one.)

(3.1) states that, with probability one,

$$\lim_{N\to\infty}\left(e_N - \frac{D(N)}{N}\right) = 0.$$

Hence for all $N$ sufficiently large, with probability one,

$$e_N < L - \epsilon,$$

i.e., with probability one complete inspection is never resumed.
When (4.9) holds, therefore, with probability one and with a finite number of 
exceptions SPA will require only partial inspection. 


PROOF of (4.12): If $p = \frac{L}{1-f}$ and complete inspection finally never resumes, then (4.12) follows easily. If $p = \frac{L}{1-f}$ and partial and complete inspection alternate infinitely many times, then the proof is similar to that of (4.8) and is therefore omitted. In either case the desired result follows.
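The limiting behaviour in Cases a and b can be illustrated by simulation. The sketch below is not the authors' procedure verbatim: it assumes that $e_N = \left(\frac{1}{f} - 1\right)k_N/N$, where $k_N$ is the number of defectives found under partial inspection (a reading consistent with the relations used in Section 3), that complete inspection starts when $e_N > L$ and partial inspection resumes when $e_N \le L$, and that every defective found is removed. The function name, the seed, and the run length are hypothetical choices.

```python
import random


def simulate_spa(p, L, group_size, n_units, seed=0):
    """Simulate the plan on a process in statistical control.

    group_size plays the role of 1/f. Returns the observed outgoing
    fraction defective and the observed fraction of units inspected.
    """
    rng = random.Random(seed)
    produced = 0             # N
    found_partial = 0        # k_N (assumed ingredient of e_N)
    outgoing_defectives = 0  # D(N)
    inspected = 0            # I(N)
    complete = False
    while produced < n_units:
        if complete:
            # Every unit is inspected; any defective is removed.
            produced += 1
            inspected += 1
        else:
            group = [rng.random() < p for _ in range(group_size)]
            found = group[rng.randrange(group_size)]
            produced += group_size
            inspected += 1
            found_partial += found
            outgoing_defectives += sum(group) - found
        e_n = found_partial * (group_size - 1) / produced
        complete = e_n > L
    return outgoing_defectives / produced, inspected / produced


if __name__ == "__main__":
    print(simulate_spa(0.10, 0.02, 10, 200_000, seed=1))  # Case a
    print(simulate_spa(0.01, 0.02, 10, 200_000, seed=1))  # Case b
```

With $p = 0.10$, $L = 0.02$, $f = 0.1$ the run should settle near an outgoing quality of $L$ and an inspected fraction near $1 - L/p = 0.8$; with $p = 0.01$ it should settle near $p(1-f) = 0.009$ and $f = 0.1$, apart from an initial transient.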


5. A class of SP all of which insure both a given AOQL and minimum inspec- 
tion. Let the definition of SPA be modified in the following particulars: 
(b) Begin full inspection whenever

$$e_N > L + \phi(N).$$

(c) Resume partial inspection when

$$e_N \le L - \psi(N).$$

Let $\phi(N)$ and $\psi(N)$ be such that

$$-\psi(N) \le \phi(N),$$

$$\lim_{N\to\infty} \phi(N) = \lim_{N\to\infty} \psi(N) = 0.$$

(SPA corresponds to the case $\phi(N) = \psi(N) = 0$.) Then all the SP of this class have the property that the AOQL is $L$ and that inspection is at a minimum in







the sense of Section 4. The proofs are essentially the same as those for SPA 
and hence will be omitted. 


6. The inspection plans of Section 5 can also be applied to lot inspection. 
We shall carry on the discussion of this section in terms of SPA, but the results 
apply to all the members of the class of plans described in Section 5. We shall 
show that SPA can also be applied when the product is submitted for inspection 
in lots. Although we assumed previously that the units of the product are 
arranged in order of production, the results obtained for SPA remain valid for 
any arbitrary arrangement of the units. If the product is submitted in lots we 
may arrange the units as follows: Let $l_1, l_2, \cdots$, etc. be the successive lots in the order of their submission for inspection. Within each lot we consider the units arranged in the order in which they are chosen for inspection. In this way we have arranged all units in an ordered sequence and the inspection can be applied as described before. Thus, we start with partial inspection, i.e., we take out groups of $\frac{1}{f}$ elements in $l_1$ and inspect one unit (selected at random) from each of these groups. When $e_N > L$, we start complete inspection and revert to partial inspection as soon as $e_N \le L$. When the units in $l_1$ are used up in the process of inspection, we continue, using the units of $l_2$, etc.


If it is found inconvenient to take out a group of $\frac{1}{f}$ units and then to select one unit for inspection, we could modify the sampling inspection plan as follows: Instead of taking out a group of $\frac{1}{f}$ and then selecting at random one unit from it, we select at random one unit from the uninspected part of the lot and look upon this unit as the unit selected at random from a hypothetical group of $\frac{1}{f}$ units. Thus we can proceed exactly as before, except that we have to keep in mind that with each unit inspected under "partial inspection" we have used up another set of $\frac{1}{f} - 1$ units. Thus, as soon as $\left(\frac{1}{f} - 1\right)$ times the number of units inspected under "partial inspection" becomes equal to or greater than the number of units in the uninspected part of the lot, the inspection of that lot is already terminated, and we have to start using the units of the next lot. The inconvenience caused by the necessity of keeping track of the number of units inspected under "partial inspection" and of the number of units in the uninspected part of the lot can be eliminated by further modifying the inspection plan as follows: Instead of beginning complete inspection as soon as $e_N > L$, we continue "partial inspection" until $E_N = e_N - L$ is so large that complete inspection of all the units of the lot not yet used up has to be made in order to bring $e_N$ down to $L$ at the end of the lot. This leads to the following sampling procedure, to be known as SPB: Let $N_0$ be the number of units in the lot, let $N_L$ be the serial number of the last unit in the preceding lot, and let $E(N_L) = N_L E_{N_L} = N_L(e_{N_L} - L)$ be the "excess" carried over from the preceding lot. For simplicity assume that the following are all integers:

$$LN_0 = M, \qquad \frac{fM}{1-f} = M^*, \qquad fN_0 = N^*, \qquad \frac{fE(N_L)}{1-f} = E^*.$$

The inspection procedure is then as follows: Inspect successive units drawn at random until either

(a) $M^* - E^*$ defectives have been found in the first $N' \le N^*$ units inspected. In this case inspect further an additional $N_0 - \frac{N'}{f}$ units and this terminates the inspection of the lot. The excess to be carried over to the next lot is then zero.
Or 
(b) $N^*$ units have been inspected and the number of defectives found is $H < M^* - E^*$. In this case the inspection of the lot is terminated and the present negative excess

$$E(N_L + N_0) = [H - (M^* - E^*)]\frac{(1-f)}{f}$$

is carried over to the next lot. (The serial number of the last element in the present lot is $N_L + N_0$ and

$$e_{N_L+N_0} = \frac{N_L e_{N_L} + H\frac{(1-f)}{f}}{N_L + N_0}.$$

Hence the present excess is

$$(N_L + N_0)\left[e_{N_L+N_0} - L\right] = N_L e_{N_L} + H\frac{(1-f)}{f} - LN_L - LN_0$$

$$= N_L(e_{N_L} - L) + H\frac{(1-f)}{f} - M$$

$$= \frac{(1-f)}{f}[H - M^* + E^*],$$

as given above.)
We note an important property of SPB: The excess carried over from a pre- 
ceding lot is never positive. 
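The bookkeeping of SPB for a single lot can be sketched as follows. This is a minimal illustration under the stated integrality assumption, with hypothetical names (`spb_lot`, `draws`); the caller supplies the 0/1 defect flags of the units in the random order in which they are drawn, and the function reports which terminating event occurred, how many units were inspected, and the excess carried to the next lot.

```python
from fractions import Fraction


def spb_lot(draws, lot_size, f, L, carried_excess=Fraction(0)):
    """Apply SPB to one lot of lot_size (= N0) units.

    draws: 0/1 flags (1 = defective) in the order the units are drawn.
    carried_excess: E(N_L), the non-positive excess from the preceding lot.
    """
    f, L = Fraction(f), Fraction(L)
    m_star = f * L * lot_size / (1 - f)      # M* = f M / (1 - f), M = L N0
    n_star = f * lot_size                    # N* = f N0
    e_star = f * carried_excess / (1 - f)    # E*
    limit = m_star - e_star                  # M* - E*
    found = 0
    for n_inspected, flag in enumerate(draws, start=1):
        found += flag
        if found >= limit:
            # Event (a): inspect a further N0 - N'/f units; excess is zero.
            extra = lot_size - n_inspected / f
            return {"event": "a", "inspected": n_inspected + extra,
                    "excess": Fraction(0)}
        if n_inspected == n_star:
            # Event (b): carry the negative excess to the next lot.
            excess = (found - limit) * (1 - f) / f
            return {"event": "b", "inspected": Fraction(n_inspected),
                    "excess": excess}
    raise ValueError("fewer than N* draws were supplied")


if __name__ == "__main__":
    f, L = Fraction(1, 10), Fraction(9, 100)
    first = spb_lot([0] * 10, 100, f, L)
    print(first)
    print(spb_lot([1, 0, 1] + [0] * 7, 100, f, L, first["excess"]))
```

In the usage shown, a clean first lot leaves a negative excess, which raises the number of defectives tolerated in the second lot before complete inspection is triggered.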







7. Possible modifications of the SP to achieve local stability. Although 
the sampling plans discussed in previous sections are optimum in the sense that 
they guarantee the desired AOQL with a minimum of inspection when the 
production process is in statistical control, they do not always behave very 
favorably as far as local stability is concerned. To make this point clear, 
consider the following example: Suppose that during a very long initial time 
period the production process functions very well and the relative frequency 
of defectives produced is well below $L$. Thus, applying SPA, say, $e_N - L$ will
be considerably less than zero at the end of this period. Now suppose that then 
the production process suddenly deteriorates and the number of defectives 
produced during the next period of time is considerably higher than L. In spite 
of that, complete inspection will not begin for quite some time because $e_N$ became
so small during the initial period. Thus there will be a long segment in the se- 
quence of outgoing units within which the relative frequency of defectives will 
be larger than the prescribed AOQL. Of course, this segment will be counter- 
balanced by other segments where the relative frequency of defectives will be 
below the AOQL, so that the AOQL will not be violated. Nevertheless, the 
occurrence of long segments with too many defectives, i.e., a lack of local sta- 
bility, is not desirable. 

It should be noted that, even though SPA was not designed to achieve con- 
siderable local stability, drastic lack of local stability cannot occur when the 
production process is in statistical control and SPA is employed. In the example
given above where the outgoing quality was not locally stable, it was assumed 
that there were variations in the production process. The existence of statistical 
control acts as an important stabilizing factor on the quality. 

In this section we want to discuss several possible modifications of SPA which 
will insure a greater degree of local stability. One such modification is the 
following: We choose a positive constant $A$ and we define the excess $E^*$ for each value $N$ as follows: $E^*(N)$ is equal to the excess $E(N)$ as originally defined ($= N[e_N - L]$) as long as for all $N' \le N$, $E(N') \ge -A$. The difference $E^*(N+1) - E^*(N) = E(N+1) - E(N)$ for all $N$ for which $E(N+1) - E(N) \ge 0$. If $E(N+1) - E(N) < 0$, then $E^*(N+1) = \max[E^*(N) + \{E(N+1) - E(N)\}, -A]$. In other words, with this modification of the sampling inspection plan we set a lower bound $-A$ for the excess.
When the excess is positive we begin complete inspection, and revert to partial 
inspection when the excess becomes non-positive. The effect of this is that, if 
the proportion of defectives produced becomes large, complete inspection will 
not be delayed very long, although the proportion of defectives produced in the 
preceding period may have been considerably below $L$. It is clear that this
modification of SPA does not increase the AOQL. However, the amount of 
inspection will be somewhat increased, especially when the quality of the product 
is less than or only slightly greater than L. If the constant A is large, the in- 
crease in the amount of inspection is only slight, but also the degree of local 
stability achieved is not very high. On the other hand, if A is small, the increase 







in the amount of inspection may be considerable, but a high degree of local 
stability is achieved. Thus, the choice of A should be made so that a proper 
balance between local stability and amount of inspection is achieved. 
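The lower-bounded excess can be written as a short recursion. The sketch below takes the sequence $E(1), E(2), \cdots$ of the original excess and returns the bounded sequence; the function name is hypothetical, and the treatment of the very first value (clipping it at $-A$) is an assumption made only to start the recursion.

```python
def bounded_excess(excess, A):
    """Return E*(1), E*(2), ... given E(1), E(2), ... and the bound -A."""
    bounded = []
    previous_raw = None
    for raw in excess:
        if previous_raw is None:
            current = max(raw, -A)          # assumed starting rule
        else:
            step = raw - previous_raw       # E(N+1) - E(N)
            current = bounded[-1] + step
            if step < 0:
                current = max(current, -A)  # never fall below -A
        bounded.append(current)
        previous_raw = raw
    return bounded


if __name__ == "__main__":
    print(bounded_excess([-1, -3, -6, -4, -7], A=5))
```

Once the bound has been hit, increases in the original excess are passed through in full, so a deterioration in quality pushes the bounded excess toward zero sooner than it would the original excess.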
Modifying SPA by setting a lower limit for the excess has the disadvantage 
that the mathematical treatment of this case is involved. We shall, therefore, 
consider another modification of the inspection plan which will have largely the 
same effect, but whose mathematical treatment appears to be much simpler. A 
fixed positive integer $N_0$ is chosen and the inspection scheme is designed so that $E_{N_0} \le 0$ is assured. If $E_{N_0}$ is negative, we replace it by zero. In other words, no excess is carried over from the first segment of $N_0$ units to the next segment of $N_0$ units. Thus, the second segment of $N_0$ units is treated exactly the same way as if it were the first segment, and this is repeated for each consecutive segment of $N_0$ units. This modification of SPA (the resulting plan is to be known as SPC) has essentially the same effect as setting a lower bound for the excess. Again it is clear that by this modification the AOQL is not increased, but the amount of inspection may be increased. The latter is particularly true when $N_0$ is small, which corresponds to very high local stability requirements. More efficient plans than SPC can probably be devised for this situation.
Undoubtedly, there are many other possible modifications of the inspection 
plan by which a greater degree of local stability can be achieved at the price of 
somewhat increased inspection. It is not the purpose of this paper to enumerate 
all these possibilities or to develop a theory as to which of them may be con- 
sidered an optimum procedure. We shall restrict ourselves to a discussion of the 
mathematical consequences of SPC. First we define it precisely. If it is to be 
applied to inspection of lots of size $N_0$ then SPC is simply SPB with $E(N_L)$ and $E^*$ always zero. When applied to continuous production it will operate as follows: Assume for convenience that $M = LN_0$, $N^* = fN_0$, $M^* = \frac{fM}{1-f}$ are all integers.

(a) Begin each segment of $N_0$ units with partial inspection, i.e., inspect one unit chosen at random from each successive group of $\frac{1}{f}$ units. Continue partial inspection until one of the following events occurs: either

(b) $M^*$ defectives are found. In this case begin complete inspection with the first unit which follows the group in which the last of the $M^*$ defectives was found and continue until the end of the segment of $N_0$ units.
or

(b') $N^*$ groups of $\frac{1}{f}$ units are partially inspected.

(c) Repeat with the next segment of $N_0$ units.

Comparison with SPB shows that, in SPC, if (b) occurs earlier or at the same time as (b'), then $E_{N_0} = 0$, while if (b') occurs before (b) we have $E_{N_0} < 0$. In contradistinction to SPB, in SPC there is no carrying over of the excess.

Let us determine the AOQ for SPC when the production process is in a state 







of statistical control. Denote by $p$ the probability that a unit produced will be defective. Let the chance variable $H$ denote the number of defectives found during partial inspection. The probability that $H = i < M^*$ is

$$\binom{N^*}{i} p^i (1-p)^{N^*-i}.$$

$H \le M^*$ always. We have, when $H = i$,

$$E(N_0) = \frac{i(1-f)}{f} - LN_0,$$

and hence

$$e_{N_0} = \frac{(1-f)}{f}\frac{i}{N_0};$$

therefore

(7.1) $$\text{AOQ} = L\left[1 - \frac{1}{M^*}\sum_{i=0}^{M^*-1}(M^* - i)\binom{N^*}{i}p^i(1-p)^{N^*-i}\right].$$

The reduction from the original quality $p$ to the AOQ was achieved by inspecting a fraction of units which is $\frac{1}{p}$ times the reduction in the frequency of defectives. Hence, with probability one, the fraction of units inspected when the production process is in statistical control is

(7.2) $$I = 1 - \frac{L}{p} + \frac{L}{pM^*}\sum_{i=0}^{M^*-1}(M^* - i)\binom{N^*}{i}p^i(1-p)^{N^*-i}.$$

When $p > \frac{L}{1-f}$ we see from Section 4 that the third term of the right member of (7.2) represents the price paid in fraction of inspection above the minimum in return for the local stability achieved. When $p \le \frac{L}{1-f}$ the additional inspection is of course $I - f$.

As $N_0$ becomes larger, SPC becomes more and more like SPA, and consequently the amount of inspection tends to the minimum. As $N_0$ becomes
smaller, the degree of local stability achieved becomes higher and must be 
paid for by an increasing amount of inspection. An illustrative example will be 
given in the next section. It has already been pointed out that the mere exist- 


ence of statistical control implies a considerable amount of local stability even 
when SPA is applied. 
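For moderate segment sizes the AOQ and the fraction inspected under SPC can be evaluated directly from the binomial sum. The sketch below assumes the forms $\text{AOQ} = L\,[1 - T'/M^*]$ and fraction inspected $= 1 - \text{AOQ}/p$, where $T'$ is the binomial sum of $(M^* - i)$ over $i < M^*$; the function name and the parameter values in the usage lines are illustrative only, and $M^*$ and $N^*$ are rounded to integers in line with the integrality assumption.

```python
from math import comb


def spc_aoq_and_inspection(p, L, f, n0):
    """Return (AOQ, fraction inspected) for SPC under statistical control."""
    m_star = round(f * L * n0 / (1 - f))   # M* = f L N0 / (1 - f)
    n_star = round(f * n0)                 # N* = f N0
    t_prime = sum(
        (m_star - i) * comb(n_star, i) * p**i * (1 - p) ** (n_star - i)
        for i in range(m_star)
    )
    aoq = L * (1 - t_prime / m_star)
    inspected = 1 - aoq / p                # (p - AOQ) / p
    return aoq, inspected


if __name__ == "__main__":
    for n0 in (400, 1000, 2000):
        print(n0, spc_aoq_and_inspection(p=0.09, L=0.045, f=0.1, n0=n0))
```

Because the sum is non-negative, the computed AOQ never exceeds $L$, and the inspected fraction never falls below $1 - L/p$.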







The only practical difficulty which may arise in evaluating the formulas in (7.1) and (7.2) might come from attempting to evaluate

$$T' = \sum_{i=0}^{M^*-1}(M^* - i)\binom{N^*}{i}p^i(1-p)^{N^*-i}.$$

For those values of the parameters which are likely to occur in application, a good approximation to $T'$ (exactly how good we shall not investigate here) is given by

$$T = \sum_{i=0}^{M^*-1}(M^* - i)\frac{e^{-N^*p}(N^*p)^i}{i!}.$$



A table of T for integral values of M* from 2 to 16 and for integral values of N*p 
from 1 to 25 is given below. The computations were performed under the 
direction of Mr. Mortimer Spiegelman of the Metropolitan Life Insurance 
Company, to whom the authors are deeply obliged. 
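Individual entries of such a table can be recomputed from the Poisson form of $T$. The sketch below is a direct transcription of that sum; the function name `poisson_T` is hypothetical.

```python
from math import exp, factorial


def poisson_T(m_star, mean):
    """Sum over i < M* of (M* - i) e^{-mean} mean^i / i!, with mean = N* p."""
    return sum(
        (m_star - i) * exp(-mean) * mean**i / factorial(i)
        for i in range(m_star)
    )


if __name__ == "__main__":
    for m_star in (2, 3, 4):
        print(m_star, [round(poisson_T(m_star, mean), 2) for mean in (1, 2, 3)])
```

For $M^* = 2$ the sum reduces to $(2 + N^*p)\,e^{-N^*p}$, which makes a convenient hand check.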


Table of $T = \sum_{i=0}^{M^*-1}(M^* - i)\frac{e^{-N^*p}(N^*p)^i}{i!}$





8 








| 1.10) .54 .25]  .11/ .05) .02| . 00; . 
| 2.02! 1.22) .67| .35|  .17]/  .os| . .02| . ; .00 | .00 
3.00| 2.08} 1.32} .78| .44/ .23] . 06) . ‘ .01 | .00 
4. 00} 3.02) 2.13) 1. .88| .52) . mee 4 ; .02| .01 
5.00) 4.01} 3.05} 2.20) 1.49] .96| .5¢ 35]. : .06 | .03 
6. 00} 5.00) 4.02} 3.08} 2.26) 1.57] 1. .66 | . . .14| .08 
7.00} 6.00} 5.01) 4.03} 3.12| 2.31) 1. mi lw. mi 
. 00} 7.00} 6.00} 5.01} 4.05} 3.16] 2.37 | 1.71 .51 | .32 
| 
| 
| 


1 
2 
3 
f 
5 
6 
é 


9.00; 8.00) 7.00 .00} 5.02) 4.08) 3. 2.43 85 56 
10.00} 9.00} 8.00 .00} 6.01) 5.03} 4. 3.24 | 1.31 91 
11.00} 10.00; 9.00 .00; 7.00) 6.01) 5. 4.13 1.89 | 1.37 
12.00} 11.00} 10.00} 9.00} 8.00) 7.01) 6. 5.07 2.58 | 1.95 
13.00} 12.00} 11.00) 10.00; 9.00} 8.00) 7. 6.03 3.36 | 2.63 
| 14. 00} 13.00} 12.00) 11.00} 10.00) 9.00} 8. CO 4.22 | 3.40 
15.00} 14.00} 13.00) 12.00} 11.00} 10.00) 9. 8.01 5.12 | 4.25 

















Noor WN 











8. The SP of H. F. Dodge. H. F. Dodge [1] has proposed a very interesting SP for continuous production. The plan is defined by two constants $i$ and $f$ and may be described as follows: Begin with complete inspection of the units consecutively as produced and continue such inspection until $i$ units in succession are found non-defective. Thereafter inspect a fraction $f$ of the units. Continue partial inspection until a defect is found. Then start complete inspection again and continue until $i$ units in succession are found non-defective.
Repeat the procedure. 

Dodge [1] derived formulas for determining the AOQL corresponding to any 







pair $i$ and $f$, under the assumption that the production process is in a state of
statistical control. Dodge’s formulas for the AOQL are not necessarily valid 
if we do not make this restriction on the production process, i.e., if we admit 
that the probability p that a unit will be defective may vary in any arbitrary 
way during the production process. This, of course, is not a criticism of the 
derivation of the formulas; it cannot be considered surprising that a formula is 
not valid under assumptions different from those under which it was derived. 
However, it is relevant to point out the fact that the Dodge SP does not guaran- 
tee the AOQL under all circumstances, so that care must be taken to ensure that 
certain requirements are met. Exactly what these requirements are is not 
known; statistical control is a sufficient condition, but is probably not necessary 
and could be weakened. It seems likely to the authors that, if p varies only 
slowly (with N) with infrequent ‘‘jumps,” the Dodge SP will produce results 
which will exceed the AOQL by little, if at all. But if the “jumps” are numer- 

Table of $T = \sum_{i=0}^{M^*-1}(M^* - i)\frac{e^{-N^*p}(N^*p)^i}{i!}$

(Continued)


N*p 


| 
~ 





19 20 


.00 | .00 
00 | . j .00 | . mi. ; .00 | .00 | .00 
00 | . .00 | .00 | .00 | . i .00 |. .00 | .00 | .00 | .00 
.01 | .00 | .00 | . 00 | . .00 | .00 |. .00 | .00 | .00 | .00 
wet .00 | .00 | .00 | . .00 | .00 |. .00 | .00 | .00 | .00 
04 | .02 | .01 | .01 | .00 | .00 | .00 | .00 |. .00 | .00 | .00 
.10 | .05 | .03 | .02 | .01 | .00 | .00 | .00)|. .00 | .00 | .00 | .00 
} .12 | .07 | .04 | .02 | . 01 | .00 |. .00 | .00 | .00 | .00 
Si 14 | .08 | .05] . 01 | .01 |. .00 | .00 | .00 | .00 
10 61 |. .26 | .16 | .10 |] . 03 | .02 |. .01 | .00 | .00 | .00 
11 | .97 | .66 | .44 | .29 | .18 | .11 | .07 | .04 | .02} .01 | .01 | .00 | .00 


12 |1.43 |1.02 | .71 | .48 | .32 | .20 | .13 | .08 | .05| .03| .02]| .01 | .O1 
13 |2.00 |1.48 |1.07 | .75 | .52 | .35 | .23 | .15 | .09| .06| .03 | .02| .01 
14 (2.68 [2.05 [1.54 |1.12 | .80 | .55 | .38 | .25 | .16| .10| .07| .04| . 


| 2 |2.10 11.59 |1.17 | .84 | .50 | .41 | .27| .18| .12| .07 | .05 


| 
' 


1 
2 
3 
4 
5 
6 
t 
8 
9 



































15 3.44 |% 


ous and appropriately spaced it is possible to exceed the AOQL by substantial 
amounts, as the example below will show. The Dodge plan was intended to 
serve as an aid to the detection and correction of malfunctioning of the produc- 
tion process and this use would tend to prevent the occurrence of such a phenome- 
non. Parenthetically, it should be remarked that the information obtained in 
the course of inspection according to either the plans discussed in this paper or 
any reasonable scheme should, if possible, be sent at once to the producing 
divisions for their guidance. 

An example to show that the AOQL can be exceeded can be constructed as 







follows: Let $i = 54$ and $f = 0.1$. Then according to the graphs of [1], page
272, the AOQL should be 0.02. Define a sequence of 60 successive units free 
of defectives as a segment of type 1, and a sequence of 60 successive units where 
the production process is in statistical control with p = 0.1, as a segment of type 
2. Suppose that the sequence of units produced consists of segments of types 
1 and 2 always alternating. Then it follows that the first item inspected in a 
segment of type 2 is always inspected on a partial inspection basis. We now 
assume that, unless the occurrence of a defective has previously terminated 
partial inspection, the Ist, 11th, 21st, 31st, 41st, and 51st items in a segment 
of type 2 will be chosen for partial inspection, and if the 1st item is found defec- 
tive, the entire segment of type 2 will be cleared of defectives. (Both of these 
assumptions favor the Dodge SP.) Then the situation is as described in the 


following table:

            (1)                       (2)                               (3)
            Probability of first      Expected number of defectives
            terminating partial       remaining in segment of type 2
            inspection at             after partial inspection
  Item      each item                 has been terminated               (1) x (2)

  1st       .1                        0                                 0
  11th      .09                       .9                                .081
  21st      .081                      1.8                               .1458
  31st      .0729                     2.7                               .19683
  41st      .06561                    3.6                               .236196
  51st      .059049                   4.5                               .2657205

  Probability that an entire    Expected number of defectives
  segment of type 2 will        left in a segment of type 2
  be partially inspected        which has been inspected
                                only partially                          Product

  (.9)^6 = .531441              5.4                                     2.8697814

                                                              Sum = 3.7953279

The AOQ is therefore $\frac{3.7953279}{120} = .0316+$, while $L = .02$.
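The arithmetic of this example is easy to re-run. The sketch below follows the stated assumptions (a type 2 segment of 60 units with $p = 0.1$, every tenth unit sampled starting with the 1st, and the remainder of the segment cleared once a defective is found); the function name and its parameter names are hypothetical.

```python
def expected_defectives_left(p=0.1, segment=60, gap=10):
    """Expected defectives left in one type 2 segment in the example."""
    n_samples = segment // gap              # items 1, 11, ..., 51
    total = 0.0
    for k in range(n_samples):
        # First defective is found at the (k+1)-th sampled item.
        prob_first_defect_here = (1 - p) ** k * p
        # Units skipped before that item stay uninspected; everything
        # from that item onward is taken as cleared of defectives.
        left_uninspected = k * (gap - 1)
        total += prob_first_defect_here * left_uninspected * p
    # No defective is found at any of the sampled items.
    prob_no_defect_found = (1 - p) ** n_samples
    total += prob_no_defect_found * n_samples * (gap - 1) * p
    return total


if __name__ == "__main__":
    left = expected_defectives_left()
    # One type 1 segment plus one type 2 segment is 120 units.
    print(left, left / 120)
```

Dividing by 120 spreads the expected defectives over one segment of each type, which is how the outgoing quality in the example is obtained.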


It is therefore difficult to compare the Dodge plan with any of the plans de- 
scribed in this paper with respect to their effect on a production process not in 
statistical control. If the production process is in statistical control, then, as we 
have already seen, SPA requires minimum inspection (and, incidentally, because 
of the existence of statistical control, produces a fair degree of local stability). 
If, when statistical control exists, one requires both maintenance of a given 
AOQL and a higher degree of local stability than is produced by SPA, the rele- 
vant comparison is between the Dodge plan and SPC. Both will probably give 
good results as regards local stability, but it is not possible at present to make 







these intuitive notions precise, as we have not given an exact definition of local 
stability. The following example (in which statistical control is assumed) may 
not be unrepresentative of what the situation is with regard to the amount of 
inspection required. 


Fraction of product inspected under the Dodge plan and under SPC when
L = .045, f = .1

          Fraction of product     Fraction of product inspected under SPC when
          inspected under the
   p      Dodge plan              N_0 = 400     N_0 = 1000     N_0 = 2000

  .01          .12                   .12           .10            .10
  .02          .15                   .17           .11            .10
  .03          .19                   .22           .14            .11
  .04          .23                   .28           .19            .15
  .05          .28                   .34           .26            .21

  .06          .33                   .40           .33            .29
  .07          .39                   .45           .39            .37
  .08          .45                   .50           .46            .44
  .09          .52                   .54           .51            .50
  .10          .58                   .57           .55            .55













The decrease in inspection required by SPC as $N_0$ increases is evident in this table. When $N_0 = 2000$ SPC requires less inspection than the Dodge plan, when $N_0 = 400$ it requires more inspection than the Dodge plan. How the various degrees of local stability achieved compare remains an open question. The case when $N_0 = 400$ probably lies in the region where SPC is inefficient
(as regards amount of inspection) and corresponds to a high degree of local 
stability. 

We note that both plans call for increased inspection as the quality worsens 
(p increases). If the manufacturer is required to pay for the inspection this 
serves as an added incentive to improve quality of output. 


REFERENCES 


[1] H. F. Dodge, Annals of Math. Stat., 14 (1943), p. 264.

[2] A. Kolmogoroff, Comptes Rendus Acad. Sciences, 191 (1930), p. 910.

[3] Maurice Fréchet, Généralités sur les Probabilités. Variables aléatoires. Paris, 1937.
[4] H. F. Dodge and H. G. Romig, Bell System Tech. Journal, 20 (1941), p. 1.





THE EXPECTED VALUE AND VARIANCE OF THE RECIPROCAL AND
OTHER NEGATIVE POWERS OF A POSITIVE BERNOULLIAN
VARIATE¹


By Frederick F. Stephan


War Production Board, Washington 


1. Introduction. The expected value of the reciprocal of a Bernoullian 
variate appears in certain problems of random sampling wherein both practical 
considerations and mathematical necessity make zero an inadmissible value 
of the variate. This special condition excluding zero is necessary from a practical 
standpoint because statistics can not be calculated from an empty class. It is a
necessary condition, in the mathematical sense, for the expected value, and 
variances involving it, to be finite. When subject to this condition the Bernoul- 
lian variate will be designated the positive Bernoullian variate. 

There appears to be no simple expression for the expected value of the recip- 
rocal such as there is for the expected value of positive integral powers of the 
positive Bernoullian variate. This paper presents in (15) a factorial series, 
which can be computed conveniently to any desired number of terms by means 
of the recursion relation (18). Upper and lower bounds on the remainder may 
be computed readily from (20), (21), (23), (24), and (26) and the approximation 
may be improved by adding an estimate of the remainder taken between these 
bounds. A factorial series for the expected value of negative integral powers 
is given in (34). A factorial series for the expected value of the reciprocal of the 
positive hypergeometric variate is given in (53). Series for the variances follow 
directly from the series for expected values. 

A simple example of the sampling problems in which this expected value 
appears is presented by the following instance of estimates derived from samples 
of variable size: 

An infinite population consists of items of two kinds or classes, A and B. 
Lots of N items each are drawn at random. In such lots the number of items, 
x’, that are of class A is an ordinary Bernoullian variate. Next, every lot 
composed entirely of items of class B is discarded. This excludes all lots for 
which $x' = 0$. From each remaining lot the $N - x'$ items of class B are set aside, leaving a sample composed entirely of items of class A. The number of such items, $x$, varies from sample to sample. It will be designated a positive Bernoullian variate since $x = x'$ if $x' > 0$ and $x$ does not exist if $x' = 0$. Finally, let there be associated with each item in class A a particular value of a variable, $y$, the variance of which in A is $\sigma^2$. Then if the mean value of $y$ is computed for each sample, the error variance of such means is $E(\sigma^2/x) = \sigma^2 E(1/x)$.

Instances similar to that just described occur in the design of sampling surveys 
from which statistics are to be obtained separately for each of several classes 
of the population, i.e., each statistic is to be computed from some part of the 
sample instead of all of it. They also occur in certain sampling problems in 
which some of the items drawn for a sample turn out to be blanks. 

1 Developed from a section of a paper presented to the Washington meeting of 
the Institute of Mathematical Statistics on June 18, 1943. 

A related problem concerning the error variance of the proportion of males 
among infants born in any one year was considered by G. Bohlmann in a paper 
on approximations to the expected value and standard error of a function [1]. 
His approach to the problem was to expand the function in a Taylor series and 
take the expected value of each term. The conditions under which the resulting 
series converges were developed for certain functions of a Bernoullian variate. 
The present paper provides a different and, in certain respects, superior approach 
to the problem employing a method due to Stirling [2]. While the method is 
applied to the reciprocal and negative powers it is also applicable to certain 
other functions of a Bernoullian variate. 


2. The positive Bernoullian variate. Let x be a random variate defined by a 
Bernoullian probability function subject to the special condition x > 0. The 
probability of x in n is 

(1)    $P(x) = \binom{n}{x} p^x q^{n-x} / (1 - q^n)$

where x and n are integers, $1 \le x \le n$, and 

(2)    $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$.

The probabilities p and q are constants, $0 < p = 1 - q < 1$. 

The divisor $1 - q^n$ arises from the condition excluding zero. (Bohlmann 
omits this factor, assuming that $q^n$ is negligible, an assumption that is not 
always valid. In fact, $q^n \approx e^{-np}$.) An extension of this condition to exclude 
all values of x less than a specified constant will be considered in a later section. 

Throughout this paper summation is understood to be from x = 1 to x = n 
unless it is shown otherwise. 


3. Expected values and moments. The expected values of x and its positive 
integral powers are 

(3)    $E(x) = np/(1 - q^n)$
(4)    $E(x^2) = (npq + n^2p^2)/(1 - q^n)$

and, in general 

(5)    $E(x^i) = \nu_i/(1 - q^n) = \sum_{j=1}^{i} S_i^j \frac{n!}{(n-j)!}\, p^j / (1 - q^n)$

where $\nu_i$ is the ith moment about zero of an ordinary Bernoullian variate with 
the same n and p and the $S_i^j$ are the Stirling numbers of the second kind (see 
Table 1). 

The moments about E(x) are somewhat more complicated than the corresponding 
moments of the ordinary Bernoullian variate. For example, the 
variance 

(6)    $\sigma^2(x) = \frac{npq}{1 - q^n} - \frac{n^2p^2q^n}{(1 - q^n)^2}$

and the third moment 

(7)    $E\{(x - E(x))^3\} = \frac{npq(q - p)}{1 - q^n} - \frac{3n^2p^2q^{n+1}}{(1 - q^n)^2} + \frac{n^3p^3q^n(1 + q^n)}{(1 - q^n)^3}$.

The moments about np, the first moment of an ordinary Bernoullian variate, 
are 

(8)    $E\{(x - np)^i\} = (\mu_i + (-1)^{i+1}(np)^i q^n)/(1 - q^n)$

where $\mu_i$ is the ith moment, about the mean, of an ordinary Bernoullian variate 
with the same values n and p. 

TABLE 1 
Stirling numbers of the second kind, $S_i^j$ 

  i \ j     1       2       3        4        5        6
    1       1
    2       1       1
    3       1       3       1
    4       1       7       6        1
    5       1      15      25       10        1
    6       1      31      90       65       15        1
    7       1      63     301      350      140       21
    8       1     127     966    1,701    1,050      266
    9       1     255   3,025    7,770    6,951    2,646
   10       1     511   9,330   34,105   42,525   22,827
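The moment formulas of this section can be checked numerically. The following is a minimal sketch, not part of the original paper: it builds the probability function (1) with exact fractions and compares brute-force moments with (3), (4), (6), (7) and the case i = 3 of (8). The function names and the values n = 12, p = 3/10 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def positive_binomial_pmf(n, p):
    """Probabilities P(x), x = 1..n, of equation (1) as exact fractions."""
    q = 1 - p
    divisor = 1 - q**n                     # removes the excluded value x = 0
    return {x: comb(n, x) * p**x * q**(n - x) / divisor for x in range(1, n + 1)}


def expectation(pmf, f):
    """E(f(x)) under the given probability function."""
    return sum(f(x) * w for x, w in pmf.items())


n, p = 12, Fraction(3, 10)                 # illustrative values only
q = 1 - p
d = 1 - q**n
pmf = positive_binomial_pmf(n, p)

mean = expectation(pmf, lambda x: x)
second = expectation(pmf, lambda x: x**2)
variance = expectation(pmf, lambda x: (x - mean)**2)
third = expectation(pmf, lambda x: (x - mean)**3)
third_about_np = expectation(pmf, lambda x: (x - n * p)**3)

mu3 = n * p * q * (q - p)                  # third central moment of the ordinary variate
checks = {
    "(3)": mean == n * p / d,
    "(4)": second == (n * p * q + n**2 * p**2) / d,
    "(6)": variance == n * p * q / d - n**2 * p**2 * q**n / d**2,
    "(7)": third == (n * p * q * (q - p) / d
                     - 3 * n**2 * p**2 * q**(n + 1) / d**2
                     + n**3 * p**3 * q**n * (1 + q**n) / d**3),
    "(8), i = 3": third_about_np == (mu3 + (n * p)**3 * q**n) / d,
}
print(checks)
```

Because every quantity is a `Fraction`, the comparisons are exact rather than approximate.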


The expected value of the reciprocal is 

(9)    $E\left(\frac{1}{x}\right) = \frac{1}{1 - q^n}\left[\frac{1}{1}\,npq^{n-1} + \frac{1}{2}\binom{n}{2}p^2q^{n-2} + \cdots + \frac{1}{i}\binom{n}{i}p^iq^{n-i} + \cdots + \frac{1}{n}\,p^n\right]$.

This equation is not suitable for the computation of E(1/x) to a satisfactory 
degree of approximation unless np is small, say less than 5 for most purposes. 
The number of terms necessary to obtain a computed value with four significant 
figures, for example, may be estimated to be approximately $8\sqrt{npq}/(1 - q^n)$. 
Expressed as a function of q, E(1/x) becomes 

(10)    $E\left(\frac{1}{x}\right) = \frac{1}{1 - q^n}\sum_{i=0}^{n-1}\frac{q^i - q^n}{n - i}$,

a series which may be convenient for small values of q. 
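As an illustration of (9) and (10), the sketch below sums both series in floating point for the values n = 100, p = 0.1 used later in Example 1. It is an assumption-laden illustration rather than the author's computing scheme; the function names are hypothetical, and the index convention used for (10) is the one that makes it agree term by term with (9).

```python
from math import comb


def reciprocal_mean_binomial(n, p, terms=None):
    """Sum of the first `terms` terms of (9); all n terms when `terms` is None."""
    q = 1.0 - p
    last = n if terms is None else terms
    total = sum(comb(n, x) * p**x * q**(n - x) / x for x in range(1, last + 1))
    return total / (1.0 - q**n)


def reciprocal_mean_q_series(n, p):
    """E(1/x) written as a function of q, as in (10)."""
    q = 1.0 - p
    return sum((q**i - q**n) / (n - i) for i in range(n)) / (1.0 - q**n)


full = reciprocal_mean_binomial(100, 0.1)
first_three = reciprocal_mean_binomial(100, 0.1, terms=3)
via_q = reciprocal_mean_q_series(100, 0.1)
print(round(full, 6), round(first_three, 6), round(via_q, 6))
```

The slow growth of the partial sums of (9) is what motivates the factorial series developed in the next section.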
E(1/x) may be expanded in a power series by Taylor's Theorem. It may 
also be expanded in a finite series of expected values of powers, either in E(x), 
$E(x^2), \cdots$ or in $E(x - c), E(x - c)^2, \cdots$, c being any positive constant. The 


t 
second of these three series may be obtained by expanding , _ 4 and taking 
c 


expected values, and the third by dividing out t . 7 G <) and taking ex- 


pected values. For all three expansions, however, ‘the terms become progres- 
sively more complicated and laborious to compute. A simpler and more con- 
venient series for actual computations may be obtained by expanding 1/z in a 
factorial series. 

4. Expansion of E(1/x) in a series of inverse factorials. It is easy to prove 
by induction that, for x > 0, 

(11)    $\frac{1}{x} = \frac{1}{x+1} + \frac{1!}{(x+1)(x+2)} + \frac{2!}{(x+1)(x+2)(x+3)} + \cdots + \frac{(t-1)!\,x!}{(x+t)!} + R_t(x)$

where 

(12)    $R_t(x) = t!\,(x-1)!/(x+t)!$

is the remainder after the first t terms. This is, of course, an expansion in 
Beta functions. It is also a simple special case of the expansion of a function 
in a "faculty series" or series of inverse factorials [3] with an exact expression 
for the remainder. 
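The identity (11) with remainder (12) is easy to confirm for particular integers. The sketch below is an illustration only, with hypothetical function names; it evaluates both sides with exact fractions.

```python
from fractions import Fraction
from math import factorial


def inverse_factorial_terms(x, t):
    """The first t terms (i-1)! x! / (x+i)! of expansion (11)."""
    return [Fraction(factorial(i - 1) * factorial(x), factorial(x + i))
            for i in range(1, t + 1)]


def remainder(x, t):
    """R_t(x) = t! (x-1)! / (x+t)! from (12)."""
    return Fraction(factorial(t) * factorial(x - 1), factorial(x + t))


# 1/x should equal the partial sum plus the remainder for every x >= 1, t >= 1.
identity_holds = all(
    sum(inverse_factorial_terms(x, t)) + remainder(x, t) == Fraction(1, x)
    for x in range(1, 21)
    for t in range(1, 11)
)
print(identity_holds)
```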

Let 

(13)    $s_i = \sum \binom{n+i}{x+i}\frac{p^{x+i}q^{n-x}}{1 - q^n} = \frac{1}{1 - q^n}\left[1 - \sum_{x=0}^{i}\binom{n+i}{x}p^xq^{n+i-x}\right]$.

Then, since 

(14)    $\sum \frac{x!}{(x+i)!}\,P(x) = \frac{n!\,s_i}{(n+i)!\,p^i}$

the expected value of (11) is 

(15)    $E\left(\frac{1}{x}\right) = \frac{0!\,s_1}{(n+1)p} + \frac{1!\,s_2}{(n+1)(n+2)p^2} + \cdots + \frac{(i-1)!\,n!\,s_i}{(n+i)!\,p^i} + \cdots + \frac{(t-1)!\,n!\,s_t}{(n+t)!\,p^t} + \sum R_t(x)P(x)$.

When developed as infinite series, both (11) and (15) are convergent since the 
remainders $R_t(x) \to 0$ as $t \to \infty$. 

For computing purposes it is convenient to write 

(16)    $E\left(\frac{1}{x}\right) = \sum_{i=1}^{t} u_i + E(R_t(x))$

in which, since 

(17)    $s_i = s_{i-1} - \binom{n+i-1}{i}\frac{p^iq^n}{1 - q^n}$,

the following recursion relation exists between $u_i$ and $u_{i-1}$ 

(18)    $u_i = \frac{(i-1)!\,n!\,s_i}{(n+i)!\,p^i} = \frac{(i-1)u_{i-1} - k/i}{(n+i)p}, \quad i > 1$,
        $u_1 = \frac{1 - k}{(n+1)p}$

where 

(19)    $k = npq^n/(1 - q^n) \approx np/(e^{np} - 1)$.

This reduces the computing of the $u_i$ to a simple repetitive procedure. The 
computing is still simpler in those problems in which, for the degree of precision 
desired, k is negligible. 
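The repetitive procedure of (18) and (19) can be sketched in a few lines. The function name and the use of floating point are assumptions made for illustration; the parameter values are those of the two examples given later in the paper, so the partial sums can be compared with the tabulated figures.

```python
def factorial_series_terms(n, p, t):
    """Return k of (19) and the terms u_1, ..., u_t of (16) via recursion (18)."""
    q = 1.0 - p
    k = n * p * q**n / (1.0 - q**n)
    u = [(1.0 - k) / ((n + 1) * p)]                        # u_1
    for i in range(2, t + 1):
        u.append(((i - 1) * u[-1] - k / i) / ((n + i) * p))
    return k, u


k1, u_first = factorial_series_terms(100, 0.1, 4)          # Example 1
partial_sums = [sum(u_first[:m]) for m in range(1, 5)]
print(round(k1, 9), [round(s, 6) for s in partial_sums])

_, u_second = factorial_series_terms(1000, 0.3, 3)         # Example 2
print(sum(u_second))
```

Each new term costs one multiplication, one subtraction and one division, which is the sense in which the series is convenient for hand or machine computation.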

An estimate of $E(R_t(x))$ should be added to the sum in (16) to improve the 
approximation. To determine a suitable estimate, a lower bound for the expected 
value of the remainders may be computed from one of the following 
inequalities: 

(20)    $E(R_t(x)) = \sum \frac{t}{x}\,\frac{(t-1)!\,x!}{(x+t)!}\,P(x)$
        $= \sum \left\{\frac{1}{m}\left(2 - \frac{x}{m}\right) + \frac{(x-m)^2}{m^2x}\right\}\frac{t\,(t-1)!\,x!}{(x+t)!}\,P(x)$
        $\ge \frac{t}{m}\left\{2u_t - \frac{(t-1)u_{t-1} - tu_t}{m}\right\}, \quad m \ne 0$

which is maximized by setting $m = \{(t-1)u_{t-1} - tu_t\}/u_t$, whence 

(21)    $E(R_t(x)) > tu_t^2/\{(t-1)u_{t-1} - tu_t\}, \quad t > 1$.

Also, since when m = E(x) 

(22)    $\sum (x - m)\frac{(t-1)!\,x!}{(x+t)!}\,P(x) < \frac{(t-1)!\,m!}{(m+t)!}\sum (x - m)P(x) = 0$

a simpler inequality is 

(23)    $E(R_t(x)) > tu_t(1 - q^n)/np$.

Further, if only the first c < n terms in (20) are taken, 

(24)    $E(R_t(x)) > \sum_{x=1}^{c}\frac{t!\,(x-1)!}{(x+t)!}\,P(x) = \sum_{x=1}^{c} v_x$

where 

(25)    $v_1 = \frac{k}{(t+1)q}$ and $v_x = \frac{(x-1)(n-x+1)p}{x(x+t)q}\,v_{x-1}$.

An upper bound may be computed from 

(26)    $E(R_t(x)) <$
            $tu_t$                                                        (26.1)
            $\tfrac{1}{2}tu_t + \tfrac{1}{2}v_1$                          (26.2)
            $\tfrac{1}{3}tu_t + \tfrac{2}{3}v_1 + \tfrac{1}{3}v_2$        (26.3)

the choice among which may be governed by computing convenience. Taken 
with (16), these inequalities provide lower and upper bounds for E(1/x). 
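A hedged sketch of how the bounds might be evaluated alongside the partial sum: it computes the lower bounds (21) and (23) and the upper bound (26.3) for n = 100, p = 0.1 and compares them with a direct summation of (9). The function names are hypothetical, and the forms of (21), (23), (25) and (26.3) used here are the ones reconstructed from the derivation in this section.

```python
from math import comb


def reciprocal_mean(n, p):
    """E(1/x) by direct summation of (9); used only as a reference value."""
    q = 1.0 - p
    total = sum(comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return total / (1.0 - q**n)


def series_with_bounds(n, p, t):
    """Partial sum of (16) with lower bounds (21), (23) and upper bound (26.3)."""
    q = 1.0 - p
    d = 1.0 - q**n
    k = n * p * q**n / d                                   # equation (19)
    u = [None, (1.0 - k) / ((n + 1) * p)]                  # u[i] holds u_i
    for i in range(2, t + 1):
        u.append(((i - 1) * u[i - 1] - k / i) / ((n + i) * p))
    partial = sum(u[1:])
    v1 = k / ((t + 1) * q)                                 # equation (25)
    v2 = (n - 1) * p * v1 / (2 * (t + 2) * q)
    lower_23 = t * u[t] * d / (n * p)
    lower_21 = t * u[t]**2 / ((t - 1) * u[t - 1] - t * u[t]) if t > 1 else None
    upper_263 = t * u[t] / 3 + 2 * v1 / 3 + v2 / 3
    return partial, lower_21, lower_23, upper_263


exact = reciprocal_mean(100, 0.1)
sum1, _, low23_1, up1 = series_with_bounds(100, 0.1, 1)
sum2, low21_2, low23_2, up2 = series_with_bounds(100, 0.1, 2)
print(sum1 + up1, sum2 + low21_2, exact)
```

With t = 2 the value of E(1/x) is already bracketed to within about 0.004, and the bracket narrows quickly as t grows.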


5. Examples. Two examples will serve to illustrate the factorial series (15). 

EXAMPLE 1 

Computation of E(1/x) for n = 100 and p = 0.1 

np = 10        k = .000,265,621        E(1/x) = .111,527 

         Binomial       Factorial series
         sum of t       Sum of t       Lower                        Upper
   t     terms          terms          bounds*                      bound**

   1     .000,295       .098,984       .099,647                     .132,167
   2     .001,107       .108,675       .109,006   (.111,034)        .115,247
   3     .003,071       .110,548       .110,752   (.111,313)        .112,498
   4     .007,039       .111,082       .111,223   (.111,381)        .111,852
   5     .013,813       .111,280       .111,385   (.111,452)        .111,657
   6     .023,743       .111,370       .111,452   (.111,478)        .111,587
   7     .036,442       .111,416       .111,483   (.111,489)        .111,556
   8     .050,796       .111,444       .111,500   (.111,497)        .111,544
   9     .065,287       .111,461       .111,509   (.111,503)        .111,537
  10     .078,474       .111,472       .111,514   (.111,508)        .111,534
  11     .089,372       .111,481       .111,518   (.111,511)        .111,532
  12     .097,604       .111,487       .111,520                     .111,530
  13     .103,320       .111,492       .111,521                     .111,529
  14     .106,985       .111,495       .111,523                     .111,529
  15     .109,164       .111,498       .111,524                     .111,529
  16     .110,369       .111,501       .111,524                     .111,528
  17     .110,992       .111,503       .111,525                     .111,528
  18     .111,294       .111,505       .111,525,4                   .111,527,5
  19     .111,431       .111,506       .111,525,6                   .111,527,3
  20     .111,489       .111,508       .111,525,8                   .111,527,1
  24     .111,526
 100     .111,527 (end of series)

* Sum of t terms plus lower bound for $E(R_t(x))$ from (24) with c = 3. Numbers 
in parentheses are calculated from (21). 
** Sum of t terms plus upper bound on $E(R_t(x))$ from (26.3). 


EXAMPLE 2 
Computation of E(1/x) for n = 1000 and p = 0.3 


np = 300 k =9.7 X 10-" 


t Sum of t terms Factorial series upper and lower bounds* 
] .003 ,330 ,003 , 330 

{ oe 

} .003 ,346, 
2 .003 ,341 081, 185 paragon 


.003 ,341,0 (.003,341 ,155,4) 
* Computed as in Example 1. 








3 si .003 ,341 ,211 

.003 ,341 , 154,817 ore 
4 on .003 ,341 , 156 ,29 

eee oo ,155,56 


on 


.003 ,341 , 155,559 oS 


.003 ,341 , 155 , 57 


For the binomial series, the sum of the largest eight terms of (9), not the 
first eight terms, is approximately .0007 which is less than 1/4 of the 
value of E(1/x). 

In the first example the value of np is almost small enough to make computation 
by (9) convenient. In the second example about 120 terms of (9) must be computed 
to obtain an approximation to four significant figures but only four terms 
of the factorial series are needed to obtain seven significant figures. It is evident 
that as np increases, the number of terms of (16) required to obtain an 
approximation to a given number of significant figures decreases. The opposite 
is true of (9) as n increases, or as p approaches a value near 1/2. 


6. Extending the special condition. In some sampling problems all values 
of x less than a specified value, g, and greater than another specified value, h, 
are inadmissible. Then the probability of x in n is 

(27)    $P(x \mid g, h) = \binom{n}{x}p^xq^{n-x}/s_{0,g,h}, \quad g \le x \le h,$

where 

(28)    $s_{0,g,h} = \sum_{x=g}^{h}\binom{n}{x}p^xq^{n-x}$.

With this new condition, E(1/x) is given by (15) if $s_i$ is replaced by 

(29)    $s_{i,g,h} = \sum_{x=g}^{h}\binom{n+i}{x+i}p^{x+i}q^{n-x}/s_{0,g,h}$

and the summation in the remainder term is from g to h. Also since 

(30)    $s_{i,g,h} = s_{i-1,g,h} - \left\{\binom{n+i-1}{g+i-1}p^{g+i-1}q^{n-g+1} - \binom{n+i-1}{h+i}p^{h+i}q^{n-h}\right\}/s_{0,g,h}$

a recursion relation similar to (18) may be used in computing 

(31)    $u_{i,g,h} = \frac{(i-1)!\,n!\,s_{i,g,h}}{(n+i)!\,p^i} = \frac{(i-1)u_{i-1,g,h} - (i-1)!\,\{k_g/(g+i-1)! - k_{h+1}/(h+i)!\}}{(n+i)p}$

where 

(32)    $k_g = \frac{n!\,p^gq^{n-g+1}}{(n-g)!\,s_{0,g,h}}$

(33)    $k_{h+1} = \frac{n!\,p^{h+1}q^{n-h}}{(n-h-1)!\,s_{0,g,h}}$.

The inequalities (20) to (23) inclusive and (26) are applicable to this extension 
on substitution of $u_{i,g,h}$ for $u_i$. 
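The recursion for the doubly restricted variate can be checked against a direct evaluation of the terms. The sketch below is illustrative only. It assumes that the leading quantity $(i-1)u_{i-1,g,h}$ is to be read as 1 when i = 1 (the analogue of $u_1$ in (18)) and that $k_{h+1}$ is taken as zero when h = n; both conventions, like the function names, are assumptions not stated in the text.

```python
from fractions import Fraction
from math import comb, factorial


def normaliser(n, p, g, h):
    """s_{0,g,h} of equation (28)."""
    q = 1 - p
    return sum(comb(n, x) * p**x * q**(n - x) for x in range(g, h + 1))


def terms_direct(n, p, g, h, t):
    """u_{i,g,h} = E((i-1)! x! / (x+i)!) summed directly over g <= x <= h."""
    q = 1 - p
    s0 = normaliser(n, p, g, h)
    return [sum(Fraction(factorial(i - 1) * factorial(x), factorial(x + i))
                * comb(n, x) * p**x * q**(n - x) for x in range(g, h + 1)) / s0
            for i in range(1, t + 1)]


def terms_recursive(n, p, g, h, t):
    """The same terms from recursion (31) with k_g and k_{h+1} of (32), (33)."""
    q = 1 - p
    s0 = normaliser(n, p, g, h)
    k_g = factorial(n) * p**g * q**(n - g + 1) / (factorial(n - g) * s0)
    if h == n:
        k_h1 = Fraction(0)                                 # assumed convention
    else:
        k_h1 = factorial(n) * p**(h + 1) * q**(n - h) / (factorial(n - h - 1) * s0)
    u = []
    for i in range(1, t + 1):
        lead = Fraction(1) if i == 1 else (i - 1) * u[-1]  # assumed start value
        corr = factorial(i - 1) * (k_g / factorial(g + i - 1) - k_h1 / factorial(h + i))
        u.append((lead - corr) / ((n + i) * p))
    return u


p = Fraction(1, 4)
restricted_ok = terms_direct(20, p, 3, 12, 6) == terms_recursive(20, p, 3, 12, 6)
unrestricted_ok = terms_direct(20, p, 1, 20, 6) == terms_recursive(20, p, 1, 20, 6)
print(restricted_ok, unrestricted_ok)
```

With g = 1 and h = n the recursion reduces to (18), which the second comparison exercises.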


7. Expansion of $E(x^{-a})$ in a factorial series. Equation (11) may be extended 
to other negative integral powers of x. If a is a positive integer 

(34)    $E(x^{-a}) = \sum x^{-a}P(x) = \frac{b_{1,a}\,s_1}{(n+1)p} + \frac{b_{2,a}\,s_2}{(n+1)(n+2)p^2} + \cdots + \frac{b_{t,a}\,n!\,s_t}{(n+t)!\,p^t} + \sum R'_t(x)P(x)$

where 

(35)    $R'_t(x) = \sum_{j=1}^{a} b_{t+1,j}\,\frac{x!}{x^{a-j+1}\,(x+t)!}$

and the $b_{i,j}$ are the absolute values of the Stirling numbers of the first kind (see 
Table 2) formed by the recursion relation 

(36)    $b_{i,j} = b_{i-1,j-1} + (i-1)\,b_{i-1,j}, \quad b_{i,j} = 0$ if $j > i$ or $j < 1$. 

It is evident that 

(37)    $\sum_{j=1}^{i} b_{i,j} = i!$

(38)    $b_{i,1} = (i-1)!$ and $b_{i,j} < i!$ if $i > 1$, 

whence 
R.(1) = = j PO 
' (¢ + 1)! 1)! — tha! P() 








Hence $R'_t(x) \to 0$ and $E(R'_t(x)) \to 0$ as $t \to \infty$ and the sum of the first t terms of 
(34) converges to $E(x^{-a})$ as $t \to \infty$. 

The following recursion relation corresponding to (18) provides a simple procedure 
for computing: 

(40)    $u_{i,a} = b_{i,a}\,u_i/(i-1)! = b_{i,a}\,\frac{(u_{i-1,a}/b_{i-1,a}) - k/i!}{(n+i)p}$.

The computing procedure, then, follows a cycle of four simple operations: 
1. Divide $\{k/(i-1)!\}$ by i. 
2. Subtract the quotient from $\{u_{i-1,a}/b_{i-1,a}\}$. 
3. Divide the difference by $\{(n+i-1)p\} + p$. 
4. Multiply this quotient by $b_{i,a}$. 

The quotient is $u_{i,a}/b_{i,a}$. The expressions in braces are quantities obtained 
in the preceding cycle. The $u_{i,a}$ may also be calculated from (18), or checked 
by such a calculation. 

TABLE 2 
Absolute values of Stirling numbers of the first kind, $b_{i,j}$* 

  i \ j          1            2            3          4          5         6
    1            1            0            0          0          0         0
    2            1            1            0          0          0         0
    3            2            3            1          0          0         0
    4            6           11            6          1          0         0
    5           24           50           35         10          1         0
    6          120          274          225         85         15         1
    7          720        1,764        1,624        735        175        21
    8        5,040       13,068       13,132      6,769      1,960       322
    9       40,320      109,584      118,124     67,284     22,449     4,536
   10      362,880    1,026,576    1,172,700    723,680    269,325    63,273

* These numbers are also known as differential coefficients of zero [4]. 
A lower bound for $E(R'_t(x))$ after t terms may be calculated from the first c 
terms of 

(41)    $E(R'_t(x)) = \sum_{x=1}^{n} R'_t(x)P(x) > \sum_{x=1}^{c} R'_t(x)P(x) = \sum_{x=1}^{c}\sum_{j=1}^{a}\frac{b_{t+1,j}\,n!\,p^xq^{n-x}}{x^{a-j+1}\,(x+t)!\,(n-x)!\,(1-q^n)}$

or from an inequality similar to (23) 

(42)    $E(R'_t(x)) > \frac{u_t}{(t-1)!}\sum_{j=1}^{a}\frac{b_{t+1,j}}{\{E(x)\}^{a-j+1}}$

which may also be written 

(43)    $E(R'_t(x)) > \frac{u_t}{(t-1)!\,\{E(x)\}^{a+1}}\left\{(E(x)+t)(E(x)+t-1)\cdots E(x) - \sum_{j=a+1}^{t+1} b_{t+1,j}\{E(x)\}^{j}\right\}$.

An upper bound may be calculated from 

(44)    $E(R'_t(x)) < \frac{u_t}{(t-1)!}\sum_{j=1}^{a} b_{t+1,j} < t(t+1)\,u_t$
BR) <LROPO) + LY Vow oy pe 
(45) < DR WP + gE BY - DR wre) 
‘i.e i atl ( + Of +t-1)- _ o- ye base, 
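The expansion underlying (34), with the remainder in the form (35), can be verified for particular integers. The sketch below generates the $b_{i,j}$ by the recursion (36), starting from $b_{1,1} = 1$ as in Table 2, and checks that the first t terms plus the remainder reproduce $x^{-a}$ exactly. The function names are hypothetical and the remainder formula is the reconstructed one, so this is a consistency check rather than a statement of the author's procedure.

```python
from fractions import Fraction
from math import factorial


def unsigned_stirling_first(rows):
    """Table b[i][j], 0 <= i, j <= rows, built from recursion (36)."""
    b = [[0] * (rows + 1) for _ in range(rows + 1)]
    b[1][1] = 1
    for i in range(2, rows + 1):
        for j in range(1, i + 1):
            b[i][j] = b[i - 1][j - 1] + (i - 1) * b[i - 1][j]
    return b


def inverse_power_expansion(x, a, t, b):
    """Return (sum of the first t terms, remainder R'_t(x)) for x**(-a)."""
    head = sum(Fraction(b[i][a] * factorial(x), factorial(x + i))
               for i in range(1, t + 1))
    tail = sum(Fraction(b[t + 1][j] * factorial(x), x**(a - j + 1) * factorial(x + t))
               for j in range(1, a + 1))
    return head, tail


b = unsigned_stirling_first(10)
expansion_ok = all(
    sum(inverse_power_expansion(x, a, t, b)) == Fraction(1, x**a)
    for a in range(1, 4)
    for t in range(1, 9)
    for x in range(1, 11)
)
print(b[10][1:7], expansion_ok)
```

For a = 1 the coefficients reduce to $b_{i,1} = (i-1)!$ and the expansion becomes (11).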


8. The positive hypergeometric variate. The theory of sampling without 
replacement from a finite population rests on the hypergeometric variate. Its 
probability function is 

(46)    $P(x \mid N, M, n) = \binom{M}{x}\binom{N-M}{n-x}\Big/\binom{N}{n}$.

In applications to finite sampling, N is the number of items in the population, 
M is the number of them that are of a certain kind, n is the number of items 
drawn for the sample, and x is the number of items of the designated kind in the 
sample. 

As in the case of the Bernoullian variate, it is necessary to exclude zero in 
defining the expected value of 1/x. The probability function of the positive 
hypergeometric variate, then, is 

(47)    $P_H(x) = P(x \mid N, M, n)/s_0, \quad x > 0$

where 

(48)    $s_0 = 1 - P(0 \mid N, M, n)$.

Throughout this section the notation will have reference to (47) instead of (1). 
The expected values of positive integral powers of x are 

(49)    $E(x) = Mn/(Ns_0)$

(50)    $E(x^2) = \frac{Mn}{Ns_0}\left\{1 + \frac{(M-1)(n-1)}{N-1}\right\}$

and, in general, 

(51)    $E(x^i) = \sum_{j=1}^{i} S_i^j\,E(x!/(x-j)!)$

where the $S_i^j$ are the Stirling numbers of the second kind and 

(52)    $E(x!/(x-j)!) = \frac{M!\,n!\,(N-j)!}{(M-j)!\,(n-j)!\,N!\,s_0}$.

The factorial series corresponding to (16) is 

(53)    $E\left(\frac{1}{x}\right) = \sum \frac{1}{x}\,P_H(x) = \sum_{i=1}^{t} u_i + E(R_t(x))$

where 

(54)    $u_i = \sum \frac{(i-1)!\,x!}{(x+i)!}\,P_H(x)$

and 

(55)    $E(R_t(x)) = \sum \frac{t!\,(x-1)!}{(x+t)!}\,P_H(x)$.

The $u_i$ may be computed from 

(56)    $u_1 = \frac{(N+1)\,s_1}{(M+1)(n+1)\,s_0} = \frac{1}{s_0}\left\{\frac{N+1}{(M+1)(n+1)} - (1 - s_0) - \frac{(N-M)!\,(N-n)!}{N!\,(N-M-n-1)!\,(M+1)(n+1)}\right\}$

and the recursion relation 

(57)    $u_i = \frac{(i-1)(N+i)\,s_i}{(M+i)(n+i)\,s_{i-1}}\,u_{i-1}$

where 

(58)    $s_i = 1 - \sum_{x=0}^{i} P(x \mid N+i, M+i, n+i)$.

The computing is quite simple in those instances in which $1 - s_i$ is negligible. 
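The hypergeometric recursion can be checked in the same way as the Bernoullian one. The sketch below computes the $u_i$ both from (56) to (58) and directly from the definition (54), using exact fractions. The function names and the values N = 30, M = 8, n = 10 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def hypergeometric(x, N, M, n):
    """P(x | N, M, n) of equation (46) as an exact fraction."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


def tail_weight(i, N, M, n):
    """s_i of equation (58); i = 0 gives s_0 of equation (48)."""
    return 1 - sum(hypergeometric(x, N + i, M + i, n + i) for x in range(i + 1))


def terms_by_recursion(N, M, n, t):
    """u_1, ..., u_t from (56) and (57)."""
    s = [tail_weight(i, N, M, n) for i in range(t + 1)]
    u = [Fraction(N + 1, (M + 1) * (n + 1)) * s[1] / s[0]]
    for i in range(2, t + 1):
        ratio = Fraction((i - 1) * (N + i), (M + i) * (n + i))
        u.append(ratio * s[i] / s[i - 1] * u[-1])
    return u


def expectation(N, M, n, f):
    """E(f(x)) for the positive hypergeometric variate, by direct summation."""
    s0 = tail_weight(0, N, M, n)
    return sum(f(x) * hypergeometric(x, N, M, n) for x in range(1, min(M, n) + 1)) / s0


N, M, n, t = 30, 8, 10, 5                                  # illustrative values only
u_rec = terms_by_recursion(N, M, n, t)
u_dir = [expectation(N, M, n, lambda x, i=i: Fraction(factorial(i - 1) * factorial(x),
                                                       factorial(x + i)))
         for i in range(1, t + 1)]
rest = expectation(N, M, n, lambda x: Fraction(factorial(t) * factorial(x - 1),
                                               factorial(x + t)))
mean_reciprocal = expectation(N, M, n, lambda x: Fraction(1, x))
print(u_rec == u_dir, sum(u_rec) + rest == mean_reciprocal)
```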


Corresponding to (26), an upper bound for the expected value of the remainders 
after t terms may be computed from 

(59)    $E(R_t(x)) <$
            $tu_t$                                                                                              (59.1)
            $\tfrac{1}{2}tu_t + \tfrac{1}{2}P_H(1)/(t+1)$                                                       (59.2)
            $\tfrac{1}{3}tu_t + \frac{2P_H(1)}{3(t+1)} + \frac{P_H(2)}{3(t+1)(t+2)}$                            (59.3)
            $\frac{1}{c+1}\,tu_t + \sum_{x=1}^{c}\left(\frac{1}{x} - \frac{1}{c+1}\right)\frac{t!\,x!}{(x+t)!}\,P_H(x)$    (59.4)

A lower bound for the expected value of the remainders may be computed 
from one of the following inequalities corresponding to (23), (21) and (24) 

(60)    $E(R_t(x)) > tu_tNs_0/(Mn)$

(61)    $E(R_t(x)) > tu_t^2/\{(t-1)u_{t-1} - tu_t\}$

(62)    $E(R_t(x)) > \sum_{x=1}^{c}\frac{t!\,(x-1)!}{(x+t)!}\,P_H(x)$.

The expected values of other negative integral powers of the positive hypergeometric 
variate may be calculated from 

(63)    $E(x^{-a}) = \sum_{i=1}^{t} b_{i,a}\,u_i/(i-1)! + E(R'_t(x))$

where 

(64)    $R'_t(x) = \sum_{j=1}^{a} b_{t+1,j}\,\frac{x!}{x^{a-j+1}\,(x+t)!}$.

With $P_H(x)$ substituted for P(x), (39), (42), (43), (44), and (45) provide lower 
and upper bounds for $E(R'_t(x))$ for the positive hypergeometric variate. Also, 
corresponding to (41) 

(65)    $E(R'_t(x)) > \sum_{x=1}^{c} R'_t(x)P_H(x)$.


9. Variance and moments of 1/x and $x^{-a}$. The variance of 1/x, which is 
$E(1/x^2) - (E(1/x))^2$, may be calculated from (16) and (34), with a = 2, for the 
positive Bernoullian variate, and from (53) and (63), with a = 2, for the positive 
hypergeometric variate. Likewise, the variance of $x^{-a}$ and the moments of 
1/x and $x^{-a}$ about E(1/x) may be computed by the usual formulae. 


REFERENCES 

[1] G. Bohlmann,2 "Formulierung und Begründung zweier Hilfssätze der mathematischen 
Statistik," Math. Annalen, 74 (1913), 341-409. 

[2] E. T. Whittaker and G. Robinson, The Calculus of Observations, London (Second 
Ed.) 1937, p. 368. 

[3] E. T. Whittaker and G. N. Watson, Modern Analysis, Cambridge (Fourth Ed.) 1927, 
p. 142. 

[4] Charles Jordan, Calculus of Finite Differences, Budapest, 1937. 

2 The writer is indebted to Dr. Felix Bernstein for the reference to Bohlmann. 








RANDOM WALK IN THE PRESENCE OF ABSORBING BARRIERS 


M. Kac 
Cornell University 


1. Introduction. The problem of random walk (along a straight line) in the 
presence of absorbing barriers can be stated as follows: 

A particle, starting at the origin, moves in such a way that its displacements 
in consecutive time intervals, each of duration $\Delta t$, can be represented by independent 
random variables 

        $X_1, X_2, X_3, \cdots$

Moreover, if at some time the total (cumulative) displacement becomes > p 
(p > 0) or < -q ($q \ge 0$) the particle gets absorbed. The problem is to determine 
the probability that "the length of life" of the particle is greater than a 
given number t. This problem also admits an interpretation in terms of a game 
of chance in which the player quits when he loses more than q or wins more than 
p. An interesting paper on this type of problem by A. Wald1 appeared recently 
in the Annals. Wald assumes that the X's are identically distributed and that 
their mean and standard deviation are different from 0.2 He is then mostly 
interested in the limiting case when both the mean and the standard deviation 
become small. The object of this paper is to propose a different method of 
attack which in some cases leads to an answer in closed form. The method we 
use has been employed repeatedly in statistical mechanics in the study of the 
so called order-disorder problem. It is due, I believe, to E. W. Montroll3. As 
far as the author knows this method was never used in connection with the 
classical probability theory and this seems to furnish an additional reason for 
publishing this paper. 


2. The simplest discrete case. We assume that each X is capable of assuming 
the values 1 and -1 each with probability $\tfrac{1}{2}$, and for simplicity's sake we let 
$\Delta t = 1$. Note that, unlike in Wald's case, the mean of X is 0. Denote by N 
the random variable which represents the "length of life" of the particle and 
let (m an integer) 

        $\delta(m) = \tfrac{1}{2}$ if m = 1 or m = -1, 
        $\delta(m) = 0$ otherwise. 

1 A. Wald, "On cumulative sums of random variables," Annals of Math. Stat., Vol. 15 
(1944), pp. 283-296. 

2 Since this was written Professor Wald informed the author that he can easily avoid the 
condition that the mean should be zero. 

3 See for instance E. W. Montroll, "Statistical Mechanics of nearest neighbor systems," 
Jour. of Chem. Physics, Vol. 9 (1941), pp. 706-721. 

Clearly we have (throughout this section we assume that both p and q are 
integers) 

        Prob. $\{N > n\}$ = Prob. $\{-q \le X_1 \le p,\ -q \le X_1 + X_2 \le p,\ \cdots,\ -q \le X_1 + \cdots + X_n \le p\}$
                          $= \sum \delta(m_1)\delta(m_2)\cdots\delta(m_n)$, 

where the summation is extended over all integers $m_1, m_2, \cdots, m_n$ for which 

        $-q \le m_1 \le p,\ -q \le m_1 + m_2 \le p,\ \cdots,\ -q \le m_1 + m_2 + \cdots + m_n \le p$. 

Letting 

        $l_j = q + m_1 + \cdots + m_j, \quad (j = 1, 2, \cdots, n),$

we see that 

(1)    Prob. $\{N > n\} = \sum_{l_1, \cdots, l_n = 0}^{p+q} \delta(l_1 - q)\,\delta(l_2 - l_1)\cdots\delta(l_n - l_{n-1})$. 

Let us now consider the (p + q + 1) by (p + q + 1) matrix 

(2)    $A = ((\delta(j - k))) = \begin{pmatrix} 0 & \tfrac12 & 0 & 0 & 0 & \cdots \\ \tfrac12 & 0 & \tfrac12 & 0 & 0 & \cdots \\ 0 & \tfrac12 & 0 & \tfrac12 & 0 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix}$

It is easily seen that the sum in (1) is equal to the sum of the elements in the 
(q + 1)-st column (or row) of the matrix $A^n$. Thus 

        Prob. $\{N > n\}$ = sum of the elements of the (q + 1)-st column of $A^n$. 
Denote by $\lambda_1, \lambda_2, \cdots, \lambda_{p+q+1}$ the eigenvalues of the matrix A and let 

        $(x_1^{(j)}, x_2^{(j)}, \cdots, x_{p+q+1}^{(j)})$

be the normalized eigenvector of A belonging to the eigenvalue $\lambda_j$. It can be 
shown by elementary means4 that 

        $\lambda_j = \cos\frac{\pi j}{p+q+2}$

and 

        $x_k^{(j)} = \frac{\sqrt{2}}{\sqrt{p+q+2}}\,\sin\frac{\pi jk}{p+q+2}$. 

Denoting by R the orthogonal matrix 

        $R = \begin{pmatrix} x_1^{(1)} & x_2^{(1)} & \cdots & x_{p+q+1}^{(1)} \\ x_1^{(2)} & x_2^{(2)} & \cdots & x_{p+q+1}^{(2)} \\ \cdots & \cdots & \cdots & \cdots \\ x_1^{(p+q+1)} & x_2^{(p+q+1)} & \cdots & x_{p+q+1}^{(p+q+1)} \end{pmatrix}$

and by R' the transposed of R we have (since the eigenvalues of A are simple) 
by a well known theorem 

        $A^n = R' \begin{pmatrix} \lambda_1^n & & 0 \\ & \ddots & \\ 0 & & \lambda_{p+q+1}^n \end{pmatrix} R$. 

It thus follows by an easy computation that the sum of the elements of the 
(q + 1)-st column (row) of $A^n$ is 

        $\sum_{k=1}^{p+q+1}\sum_{j=1}^{p+q+1} \lambda_j^n\,x_k^{(j)}x_{q+1}^{(j)} = \sum_{j=1}^{p+q+1} \lambda_j^n\,x_{q+1}^{(j)}\left(\sum_{k=1}^{p+q+1} x_k^{(j)}\right)$. 

We have 

        $\sum_{k=1}^{p+q+1} x_k^{(j)} = \frac{\sqrt{2}}{\sqrt{p+q+2}}\sum_{k=1}^{p+q+1}\sin\frac{\pi jk}{p+q+2}$
        $= 0$ for j even, 
        $= \frac{\sqrt{2}}{\sqrt{p+q+2}}\cot\frac{\pi j}{2(p+q+2)}$ for j odd, 

and therefore5 

        Prob. $\{N > n\} = \frac{2}{p+q+2}\,{\sum_j}^{*}\cos^n\frac{\pi j}{p+q+2}\,\sin\frac{\pi j(q+1)}{p+q+2}\,\cot\frac{\pi j}{2(p+q+2)}$, 

where the star on the summation sign indicates that only odd j's are taken under 
account. 
The method just illustrated is quite general but in more complicated cases 
the job of finding the eigenvalues and eigenvectors becomes formidable. 

4 Matrices of type (2) have been introduced and studied in various connections. In a 
paper by R. P. Boas and the present author recently accepted by the Duke Mathematical 
Journal references to several authors are given. In order to find the eigenvalues and the 
eigenvectors of (2) it suffices to know that 

        $\begin{vmatrix} 1 & a & 0 & 0 & \cdots \\ a & 1 & a & 0 & \cdots \\ 0 & a & 1 & a & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{vmatrix} = \frac{\rho_1^{m+1} - \rho_2^{m+1}}{\rho_1 - \rho_2}$

where m is the order of the matrix and $\rho_1$ and $\rho_2$ are roots of the equation $\rho^2 - \rho + a^2 = 0$. 
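The closed formula of this section can be compared with a direct step-by-step propagation of the walk, which amounts to applying the matrix A of (2) n times to the starting position. The sketch below is illustrative; the function names are hypothetical and p, q are taken to be non-negative integers as assumed in this section.

```python
from math import cos, pi, sin, tan


def survival_by_propagation(p, q, n):
    """Prob{N > n}: mass still inside -q..p after n steps of +1 or -1."""
    size = p + q + 1
    state = [0.0] * size
    state[q] = 1.0                         # the origin sits at index q
    for _ in range(n):
        nxt = [0.0] * size
        for pos, weight in enumerate(state):
            if pos > 0:
                nxt[pos - 1] += 0.5 * weight
            if pos + 1 < size:
                nxt[pos + 1] += 0.5 * weight
        state = nxt                        # mass stepping outside is absorbed
    return sum(state)


def survival_closed_form(p, q, n):
    """Prob{N > n} from the trigonometric sum over odd j."""
    m = p + q + 2
    total = 0.0
    for j in range(1, p + q + 2, 2):
        theta = pi * j / m
        total += cos(theta)**n * sin(theta * (q + 1)) / tan(theta / 2)
    return 2.0 * total / m


print(survival_by_propagation(4, 2, 7), survival_closed_form(4, 2, 7))
```

The propagation costs of the order of n(p + q) operations, whereas the closed form needs only about (p + q)/2 terms whatever the value of n.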


5 Professor Feller has called the author's attention to the fact that similar problems and 
formulas can be found in Chapter III of W. Burnside's Theory of Probability (Cambridge, 
1928). He also pointed out that the problem could be treated by means of Markoff chains. 

Professor G. E. Uhlenbeck has pointed out that our formula implies a known 
result from the theory of Brownian motion. 

Consider a free Brownian particle which at t = 0 is at $x = x_0$ ($x_0 > 0$). R. 
Fürth6 has shown that the probability that between t and t + dt the particle 
will be either at x = 0 or at x = d ($0 < x_0 < d$) for the first time, is given by the 
formula 

        $\frac{4\pi D}{d^2}\sum_{m=0}^{\infty}(2m+1)\,e^{-\pi^2D(2m+1)^2t/d^2}\,\sin\frac{(2m+1)\pi x_0}{d}\,dt$, 

where D is the "coefficient of diffusion." 
If we treat the one-dimensional Brownian motion as a random walk with steps 
$\pm\Delta x$, each move lasting $\Delta t$, the probability that a particle starting from $x_0$ will 
not have reached 0 or d in the time interval (0, t) can be calculated by means 
of our formula. 

We must only put $q = x_0/\Delta x$, $p = (d - x_0)/\Delta x$, $n = t/\Delta t$ and assume that as 
both $\Delta x$ and $\Delta t$ approach 0 the ratio $(\Delta x)^2/2\Delta t$ approaches the "coefficient of 
diffusion" D. 

An elementary computation shows that in this limit the Prob. $\{N > t/\Delta t\}$ 
approaches 

        $\frac{4}{\pi}\,{\sum_{j=1}^{\infty}}^{*}\ \frac{1}{j}\,e^{-(\pi^2j^2D/d^2)t}\,\sin\frac{\pi jx_0}{d}$

and that the differential of this expression (with a minus sign) gives exactly 
Fürth's expression. 


3. General theory in the continuous case. We now assume that the distribution 
function of X possesses a continuous and even density function $\rho(x)$. We 
have 

        Prob. $\{N > n\} = \int \cdots \int_{\Omega} \rho(x_1)\rho(x_2)\cdots\rho(x_n)\,dx_1\,dx_2 \cdots dx_n$

where the region of integration $\Omega$ is defined by the inequalities 

        $-q \le x_1 \le p,\ -q \le x_1 + x_2 \le p,\ \cdots,\ -q \le x_1 + \cdots + x_n \le p$. 

Introducing the new variables 

        $y_j = q + x_1 + \cdots + x_j, \quad (j = 1, 2, \cdots, n),$

we see that the Jacobian of the transformation is 1 and 

(3)    Prob. $\{N > n\} = \int_0^{p+q} \cdots \int_0^{p+q} \rho(y_1 - q)\,\rho(y_2 - y_1)\cdots\rho(y_n - y_{n-1})\,dy_1 \cdots dy_n$. 

Consider the symmetric integral equation 

(4)    $\int_0^{p+q} \rho(s - t)\,f(t)\,dt = \lambda f(s)$

and note that if $K_n(s, t)$ denotes the n-th iterated kernel of this integral equation, 
the right side of (3) is equal to 

        $\int_0^{p+q} K_n(q, t)\,dt$. 

Thus 

        Prob. $\{N > n\} = \int_0^{p+q} K_n(q, t)\,dt$. 

From the general theory of integral equations we know that 

        $K_n(s, t) = \sum_{j=1}^{\infty} \lambda_j^n f_j(s) f_j(t), \quad (n \ge 2),$

where $\lambda_1, \lambda_2, \cdots$ are eigenvalues and $f_1(t), f_2(t), \cdots$ normalized eigenfunctions 
of the integral equation (4). 

Since $\rho$ was assumed to be continuous it follows that the eigenfunctions are 
continuous and 

        Prob. $\{N > n\} = \sum_{j=1}^{\infty} \lambda_j^n f_j(q) \int_0^{p+q} f_j(t)\,dt$. 

This formula is very general and provides, in a sense, a complete solution of the 
problem in the continuous and symmetric case. Unfortunately the usefulness 
of this formula is limited by the difficulties encountered in solving integral 
equations of the type (4). 

In fact, the integral equation 

        $\frac{1}{\sqrt{2\pi}}\int_0^{p+q} e^{-(s-t)^2/2} f(t)\,dt = \lambda f(s)$, 

to which one is led by considering the normally distributed X's, appears to be 
very difficult to solve. 

6 Ann. d. Phys. 53 (1917) p. 177. 


4. A particular case. If we assume 

        $\rho(x) = \tfrac{1}{2}\,e^{-|x|}$

we are led to the integral equation 

(5)    $\int_0^{p+q} e^{-|s-t|} f(t)\,dt = 2\lambda f(s)$,7 

which is quite easy to solve. 
In fact, rewriting (5) in the form 

(6)    $e^{-s}\int_0^{s} e^{t} f(t)\,dt + e^{s}\int_s^{p+q} e^{-t} f(t)\,dt = 2\lambda f(s)$

and differentiating twice with respect to s we obtain the differential equation 

        $f''(s) + \left(\frac{1}{\lambda} - 1\right) f(s) = 0$. 

Substituting the general solution of this equation in (6) we find in an entirely 
elementary fashion that 

        $\lambda_j = \frac{1}{1 + y_j^2}$, 

        $f_j(t) = \frac{\sin y_jt + y_j\cos y_jt}{\sqrt{1 + \tfrac{1}{2}(p+q)(1 + y_j^2)}}$, 

where $y_j$ is the jth (positive) root of the transcendental equation 

(7)    $\tan (p+q)y = -\frac{2y}{1 - y^2}$. 

We have 

        $\int_0^{p+q} (\sin y_jt + y_j\cos y_jt)\,dt = \frac{1}{y_j}\{1 - \cos (p+q)y_j + y_j\sin (p+q)y_j\}$

and it is easily seen that (7) implies 

        $1 - \cos (p+q)y_j + y_j\sin (p+q)y_j = 0$ if $\cos (p+q)y_j = \frac{1 - y_j^2}{1 + y_j^2}$, 
        $1 - \cos (p+q)y_j + y_j\sin (p+q)y_j = 2$ if $\cos (p+q)y_j = -\frac{1 - y_j^2}{1 + y_j^2}$. 

Finally, 

        Prob. $\{N > n\} = 2\,{\sum_j}'\ \frac{1}{y_j}\,\frac{1}{(1 + y_j^2)^n}\,\frac{\sin y_jq + y_j\cos y_jq}{\{1 + \tfrac{1}{2}(p+q)(1 + y_j^2)\}}$, 

where the dash on the summation sign indicates that only those j's are taken 
under account for which 

        $\cos (p+q)y_j = -\frac{1 - y_j^2}{1 + y_j^2}$. 

We omit here the discussion of various limiting cases inasmuch as our main 
purpose was to obtain exact formulas. 
There are indications that some of the limiting cases are related to singular 
integral equations with continuous spectra. We may return to this subject 
at a later date. 

7 I have recently encountered the integral equation (5) in solving an entirely different 
problem. A complete discussion can be found in a restricted N.D.R.C. Report 14-305. 
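The final series of this section lends itself to a simple numerical check for n = 1, where the probability is known in closed form: Prob. {N > 1} is the probability that a single step falls between -q and p, i.e. $1 - \tfrac{1}{2}e^{-p} - \tfrac{1}{2}e^{-q}$. The sketch below is an illustration under stated assumptions, not part of the paper. It locates the admissible roots from the equation $(p+q)y + 2\arctan y = (2k+1)\pi$, k = 0, 1, 2, ..., which is a rewriting (assumed here, and obtained by putting $y = \tan(\theta/2)$) of (7) combined with the condition on the cosine; the function names and the number of terms are likewise arbitrary choices.

```python
from math import atan, cos, exp, pi, sin


def admissible_root(k, length):
    """Positive root of length*y + 2*atan(y) = (2k+1)*pi, found by bisection."""
    target = (2 * k + 1) * pi
    lo, hi = 0.0, target / length          # the left side is increasing in y
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if length * mid + 2.0 * atan(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def survival_series(p, q, n, terms=400):
    """Partial sum of the series for Prob{N > n} with density exp(-|x|)/2."""
    length = p + q
    total = 0.0
    for k in range(terms):
        y = admissible_root(k, length)
        numerator = sin(y * q) + y * cos(y * q)
        total += numerator / (y * (1.0 + y * y)**n * (1.0 + 0.5 * length * (1.0 + y * y)))
    return 2.0 * total


p, q = 1.0, 2.0                            # illustrative barrier positions
one_step_exact = 1.0 - 0.5 * exp(-p) - 0.5 * exp(-q)
print(survival_series(p, q, 1), one_step_exact)
```

For larger n the factor $(1 + y_j^2)^{-n}$ makes the series converge faster still, so a handful of roots usually suffices.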











ON THE CLASSIFICATION OF OBSERVATION DATA 
INTO DISTINCT GROUPS 


By R. v. Mises 
Harvard University 


Introduction. In scholastic examinations as well as in the examination of 
industrial products the following probability problem arises. The individuals 
of a certain population are successively subjected to trials each of which leads 
to a definite score x (one real number or a group of m real numbers). Each 
individual is supposed to belong to one of n classes. These classes are characterized 
by n probability densities $p_1(x), p_2(x), \cdots, p_n(x)$. One has to decide on 
the basis of the observed value x to which class the respective individual belongs 
and one wishes to make this decision with the smallest possible risk of failure. 

For example, let us consider an examination where the three grades A, B, C 
are attributed on the basis of a simple score x (case m = 1, n = 3). It may be 
assumed that an individual of the class A has a mean expected value of x equal 
to $\vartheta_1 = 75$ and a normal distribution with the standard deviation $\sigma_1 = 4/\sqrt{2}$. 
The analogous values for the classes B and C may be $\vartheta_2 = 50$, $\sigma_2 = 8/\sqrt{2}$ and 
$\vartheta_3 = 25$, $\sigma_3 = 12/\sqrt{2}$. In this case, the solution developed in the present paper 
allows the conclusion that the best way of grading would be to attribute the 
grade A to scores x beyond 70.0, the grade C to scores below 40.0 and B to the 
rest. The corresponding error risk will be 3.9% or the success rate 0.961. 
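The figures quoted in this example can be reproduced with the normal distribution function. The sketch below is illustrative only: it evaluates, for each class, the probability that a score falls in the region assigned to that class when the cut points are 40 and 70, and reports the smallest of the three. The function and variable names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x, mean, sd):
    """Distribution function of a normal variate with the given mean and s.d."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def success_rates(cut_low, cut_high):
    """Probability of a correct grade for classes A, B, C with the stated cuts."""
    mean_a, sd_a = 75.0, 4.0 / sqrt(2.0)
    mean_b, sd_b = 50.0, 8.0 / sqrt(2.0)
    mean_c, sd_c = 25.0, 12.0 / sqrt(2.0)
    rate_a = 1.0 - normal_cdf(cut_high, mean_a, sd_a)      # A: score beyond cut_high
    rate_b = normal_cdf(cut_high, mean_b, sd_b) - normal_cdf(cut_low, mean_b, sd_b)
    rate_c = normal_cdf(cut_low, mean_c, sd_c)             # C: score below cut_low
    return rate_a, rate_b, rate_c


rates = success_rates(40.0, 70.0)
print([round(r, 4) for r in rates], round(1.0 - min(rates), 3))
```

The three rates come out nearly equal, which is what one would expect of a partition chosen to make the smallest success rate as large as possible.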

There exists, of course, one case where the solution is trivial. If the probability 
densities $p_\nu(x)$ are limited to n non-overlapping regions $R_\nu$ (with $p_\nu = 0$ at points 
outside $R_\nu$) an obvious decision can be made without any risk of failure. An 
assumption of this kind underlies the usual procedure of grading. If, in the 
foregoing example, an individual of class A is supposed to have at any rate a score 
beyond 60 and a class C individual less than 40, it is obvious how the grades 
should be attributed without incurring any risk. It seems, however, that in 
many problems the assumption of normal distributions or some other kind of 
overlapping distributions is more appropriate. Then, the probability problem 
has to be solved. 

The solution submitted in the present paper is derived from the simplest 
principles of calculus of probability without any arbitrary assumption or hypothesis. 
If n equals 2, the problem can also be considered as a problem of testing 
a simple statistical hypothesis with a two-valued parameter.1 It has been 
shown in an earlier paper2 that under this restriction success rates higher than 
50% are obtainable. 

1 See A. Wald, Annals of Math. Stat., Vol. 15 (1944), p. 145. Here, both $p_1(x)$ and $p_2(x)$ 
are supposed to be normal distributions with the same covariance matrix. The problem 
treated by Wald is different from the one considered in the present paper since in Wald's 
paper the parameters of the two multivariate normal distributions are assumed to be 
unknown. 

2 R. v. Mises, Annals of Math. Stat., Vol. 14 (1943), p. 238. 



1. Statement of the problem. For each of n classes of individuals a probability 
density $p_\nu(x)$, $\nu = 1, 2, \cdots, n$, is given. We subdivide the m-dimensional 
x-space into n regions $R_1, R_2, \cdots, R_n$ and assign the region $R_\nu$ to the $\nu$th class. 
The probability, for an individual of this class, to have its x-value falling in 
$R_\nu$ is 

(1)    $P_\nu = \int_{(R_\nu)} p_\nu(x)\,dX, \quad \nu = 1, 2, \cdots, n,$

where dX denotes the element of the x-space (dX = dx in the case m = 1). 
In the N first trials of the indefinite sequence of trials, $N_\nu$ individuals that 
belong to the $\nu$th class will be tested. Out of these only those individuals whose 
x-value falls in $R_\nu$ will be ascribed to the $\nu$th class. Their number according 
to the definition of probability, equals $N_\nu(P_\nu + \epsilon_\nu)$ where $\epsilon_\nu$ tends towards zero 
as $N_\nu$ goes to infinity. The total number of correct decisions during the N first 
trials is therefore 

(2)    $N_1(P_1 + \epsilon_1) + N_2(P_2 + \epsilon_2) + \cdots + N_n(P_n + \epsilon_n)$

and the relative number is 

(2')   $\frac{N_1}{N}(P_1 + \epsilon_1) + \frac{N_2}{N}(P_2 + \epsilon_2) + \cdots + \frac{N_n}{N}(P_n + \epsilon_n)$. 

If N increases indefinitely a part of the $N_\nu$ must become infinite. For these 
classes, $\epsilon_\nu$ converges toward zero. For the other classes $N_\nu/N$ diminishes to 
zero. Thus, the relative number of right decisions converges towards 

(3)    $\frac{1}{N}(N_1P_1 + N_2P_2 + \cdots + N_nP_n)$. 

The $N_\nu$ are unknown. Every one of these unknowns can take each value from 
zero to N. If $P_\mu$ is the smallest $P_\nu$, the most unfavorable case, where the 
expression (3) has its smallest value, will occur with $N_\mu = N$, all other $N_\nu$ being 
zero. This value is obviously $P_\mu$. Thus it is seen that the frequency of correct 
assignments is at least equal to the smallest $P_\nu$ which may be written as $P_{\min}$. 
The greatest risk of making a false decision is $1 - P_{\min}$. 

Now the problem to be solved in the present paper can be stated as follows: 
For n given densities $p_\nu(x)$, find the subdivision of the x-space into n regions $R_\nu$ 
that gives to the smallest of the expressions $P_\nu$ defined in (1) its possibly greatest 
value. 

This problem has the type of a continuous variation problem with the integrals 
in question bounded within the limits zero to one. We may, therefore, assume 
that under reasonable restrictions for $p_\nu(x)$ a solution exists. Uniqueness of 
the solution cannot be expected in general. It seems very difficult to establish 
the conditions for unicity in other than the most simple cases. Existence of 
more than one solution would mean that each of them is an optimum with 
respect to infinitesimal modifications of the boundaries. 










70 R. V. MISES 


2. General solution. A simple problem of variation is considered as solved 
in principle when the nature of the extremals is known. In our case of a so- 
called minimax problem, where the minimum of n quantities is maximized, an 
additional relation between the n integrals is required. Both can easily be 
found in the actual case. 

Let us first consider a partition of the x-space into n regions with not all $P_\nu$
being equal. The smallest $P_\nu$ will be called $P_{\min}$ and the smallest but one $P^*$.
Among the k regions for which $P_\nu = P_{\min}$ there will be at least one, say, $R_\alpha$ that
has a common border with a region $R_\beta$ whose P-value is greater, so that
$P_\beta \geq P^*$. Now modify the boundary between $R_\alpha$ and $R_\beta$ in such a way that the space
covered by $R_\alpha$ is increased and that of $R_\beta$ decreased. According to (1) the new
values of $P_\alpha$ and $P_\beta$ will be

$$P'_\alpha = P_\alpha + \Delta, \qquad P'_\beta = P_\beta - \Delta' \tag{4}$$

with both $\Delta$ and $\Delta'$ positive. The two quantities $\Delta$ and $\Delta'$ are not independent
of one another, but they can be chosen both smaller than any given positive
number $\epsilon$. Therefore, the condition

$$P'_\alpha = P_\alpha + \Delta < P_\beta - \Delta' = P'_\beta \tag{5}$$

can be fulfilled. All other $P_\nu$-values remain unchanged.

In the case k = 1, that is, if only one region $R_\alpha$ had originally the minimum
P-value, the modified system has a greater minimum P, which equals either
$P_\alpha + \Delta$ or $P^*$. If k > 1 the new system has the same minimum P as the original
one, but its k-value is diminished by one. If we repeat the same procedure
(k − 1) times we obtain a system of regions with one single $P_\nu$ having the minimum
P-value and the next step leads to a partition of the x-space into n regions
with a smallest P-value that is greater than the original $P_{\min}$. Thus it is seen
that no partition with unequal $P_\nu$-values can solve our problem.

Secondly, if m > 1, consider a system of n regions with $P = P_1 = P_2 = \cdots = P_n$.
Take two points, x and y, on the border of any two neighboring regions
$R_\nu$ and $R_\mu$. An infinitesimal variation of the boundary would consist of adding
to $R_\nu$ in the neighborhood of the point x a space element $\delta S$, subtracting it from
$R_\mu$, and, at the same time, adding to $R_\mu$ in the vicinity of y an element $\delta S'$,
subtracting it from $R_\nu$. Then, according to (1), the new values of $P_\nu$ and $P_\mu$ will be

$$\begin{aligned} P'_\nu &= P + p_\nu(x)\,\delta S - p_\nu(y)\,\delta S' \\ P'_\mu &= P - p_\mu(x)\,\delta S + p_\mu(y)\,\delta S'. \end{aligned} \tag{6}$$

Introducing $\Delta_\nu = P'_\nu - P$ and $\Delta_\mu = P'_\mu - P$, these equations solved for $\delta S$ and
$\delta S'$ give

$$\delta S = \frac{p_\mu(y)\Delta_\nu + p_\nu(y)\Delta_\mu}{D}, \qquad \delta S' = \frac{p_\mu(x)\Delta_\nu + p_\nu(x)\Delta_\mu}{D} \tag{7}$$

where

$$D = p_\nu(x)p_\mu(y) - p_\mu(x)p_\nu(y). \tag{7'}$$










If the determinant D is positive, we find two positive quantities $\delta S$ and $\delta S'$
for any pair of positive $\Delta_\nu$ and $\Delta_\mu$. If D is negative the same is true when x and
y are interchanged. In both cases, that is, with $D \neq 0$, the original partition is
replaced by a new system of regions in which only two regions, $R_\nu$ and $R_\mu$, have
increased P-values, while (if n > 2) still $P_{\min} = P$. If to this system the procedure
as described in the foregoing is applied, a final partition with a greater
minimum value of P can be derived. The conclusion is that no solution of our
problem can include a boundary on which the determinant D is different from
zero for any two points x and y. On the other hand, it is seen that D = 0 means
that the ratio $p_\nu(x):p_\mu(x)$ has a constant value along the border. Thus the
result is reached:

The partition of the x-space that solves our problem is characterized by two properties:
(1) for all n regions $R_\nu$ the value of $P_\nu$ is the same; (2) along the border between
$R_\nu$ and $R_\mu$ the ratio $p_\nu(x)/p_\mu(x)$ is constant.

In the one-dimensional case (m = 1) only the first of these two statements is
relevant. In any case, the success rate, that is, the guaranteed ratio of correct
decisions, equals the common value of all $P_\nu$.

3. Illustrations. (a) One-dimensional case. Upon introducing the cumulative
distribution functions

$$F_\nu(x) = \int_{-\infty}^{x} p_\nu(z)\,dz \tag{8}$$

the conditions $P_1 = P_2 = \cdots = P_n$ take the form

$$F_1(x_1) = F_2(x_2) - F_2(x_1) = \cdots = F_{n-1}(x_{n-1}) - F_{n-1}(x_{n-2}) = 1 - F_n(x_{n-1}) \tag{9}$$

where $x_1, x_2, \dots, x_{n-1}$ determine the n intervals on the both-sides infinite x-axis.
If all density functions have the same form except for an affine transformation
one has

$$F_\nu(x) = F[h_\nu(x - \vartheta_\nu)], \qquad \nu = 1, 2, \dots, n. \tag{10}$$


Let us assume, for instance, that scores between 0 and 100 are attributed to
three types of individuals. The first type may have an even chance to obtain a
score between 0 and 50, the second between 40 and 80 and the third between
70 and 100. Here

$$F_\nu(x) = \tfrac{1}{2} + (x - \vartheta_\nu)p_\nu, \qquad |x - \vartheta_\nu| \leq \frac{1}{2p_\nu}, \tag{11}$$

with $\vartheta_\nu = 25, 60, 85$ and $p_\nu = \tfrac{1}{50}, \tfrac{1}{40}, \tfrac{1}{30}$. The conditions (9) supply

$$\frac{1}{2} + \frac{x_1 - 25}{50} = \frac{x_2 - x_1}{40} = \frac{1}{2} - \frac{x_2 - 85}{30} \tag{12}$$

and this, solved for $x_1, x_2$, gives $x_1 = 41\tfrac{2}{3}$, $x_2 = 75$, while the three expressions
(12) take the value 0.833. Therefore, in attributing all scores below $41\tfrac{2}{3}$ to the
first class, all scores beyond 75 to the third, and the scores in between to the
second, one is safe to make under no circumstances more than 1/6 incorrect
decisions in the long run.
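The arithmetic of this example can be checked with a short sketch. It is only an illustration under stated assumptions: the function name `three_uniform_cutpoints` is hypothetical, the three classes are uniform on integer-bounded intervals ordered from left to right, and each cut point is assumed to fall inside the overlap of two neighboring intervals, so that every success rate is linear in the cut points.

```python
from fractions import Fraction

def three_uniform_cutpoints(a1, b1, a2, b2, a3, b3):
    """Cut points x1 < x2 that give three uniform classes the same success rate.

    Class k is uniform on [ak, bk] (integers here). The common rate solves
    (x1 - a1)/len1 = (x2 - x1)/len2 = (b3 - x2)/len3.
    """
    len1, len2, len3 = b1 - a1, b2 - a2, b3 - a3
    rate = Fraction(b3 - a1, len1 + len2 + len3)
    x1 = a1 + rate * len1
    x2 = b3 - rate * len3
    # The linear formulas hold only if the cut points lie in the overlaps.
    if not (a2 <= x1 <= b1 and a3 <= x2 <= b2):
        raise ValueError("cut points fall outside the overlaps")
    return x1, x2, rate

x1, x2, rate = three_uniform_cutpoints(0, 50, 40, 80, 70, 100)
print(x1, x2, rate)  # exact fractions for the score example
```

Exact rational arithmetic is used so that the three expressions in (12) can be compared without rounding.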













In the example quoted in the introduction one has

$$p_\nu(x) = \frac{1}{\sigma_\nu\sqrt{2\pi}}\, e^{-(x - \vartheta_\nu)^2/2\sigma_\nu^2} \tag{13}$$

with $\vartheta_\nu = 75, 50, 25$ and $\sigma_\nu^2 = 8, 32, 72$. If $\Phi(x)$ denotes the integral

$$\Phi(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-z^2}\,dz$$

the conditions (9) become

$$1 + \Phi\!\left(\frac{x_1 - 25}{12}\right) = \Phi\!\left(\frac{x_2 - 50}{8}\right) - \Phi\!\left(\frac{x_1 - 50}{8}\right) = 1 - \Phi\!\left(\frac{x_2 - 75}{4}\right). \tag{14}$$

The first and last expression equated lead to $x_1 + 3x_2 = 250$. The complete
solution can be found with the help of tables for $\Phi$. It is $x_1 = 39.9920$, $x_2 =
70.0027$, with the common value twice 0.961 for the three expressions (14).
Hence the result as quoted in the introduction.
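A numerical check of this example is sketched below. It is not the author's table-based procedure: the function names are hypothetical, the classes are taken in left-to-right order (means 25, 50, 75 with variances 72, 32, 8), the first and last conditions are used to express $x_2$ through $x_1$, and the remaining condition is solved by bisection.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def solve_three_normals(means, variances):
    """Cut points x1 < x2 equalizing the three success rates.

    means and variances are listed from left to right along the x-axis.
    """
    m1, m2, m3 = means
    s1, s2, s3 = (math.sqrt(v) for v in variances)

    def x2_of(x1):
        # P1 = P3 is equivalent to (x1 - m1)/s1 = (m3 - x2)/s3.
        return m3 - (x1 - m1) * s3 / s1

    def gap(x1):
        x2 = x2_of(x1)
        p1 = normal_cdf(x1, m1, s1)
        p2 = normal_cdf(x2, m2, s2) - normal_cdf(x1, m2, s2)
        return p1 - p2  # increasing in x1 on [m1, m2]

    lo, hi = m1, m2
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x1 = 0.5 * (lo + hi)
    return x1, x2_of(x1), normal_cdf(x1, m1, s1)

x1, x2, rate = solve_three_normals((25.0, 50.0, 75.0), (72.0, 32.0, 8.0))
print(round(x1, 4), round(x2, 4), round(rate, 3))
```

The returned rate is the success rate itself, that is, half the common value of the three expressions (14).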

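Let us now take up the case of six normal distributions with equidistant
mean values $\vartheta = \pm a, \pm 3a, \pm 5a$ and one and the same variance $\sigma^2$. Then,
because of symmetry, two equations only have to be fulfilled:

$$1 + \Phi\!\left(\frac{x_1 + 5a}{\sigma\sqrt{2}}\right) = \Phi\!\left(\frac{x_2 + 3a}{\sigma\sqrt{2}}\right) - \Phi\!\left(\frac{x_1 + 3a}{\sigma\sqrt{2}}\right) = \Phi\!\left(\frac{a}{\sigma\sqrt{2}}\right) - \Phi\!\left(\frac{x_2 + a}{\sigma\sqrt{2}}\right).$$

For $\sigma^2/a^2 = 0.32$, the numerical solution gives

$$x_1 = -4.160a, \qquad x_2 = -2.062a.$$

The success rate, i.e. half the common value of the above expressions, is 0.931.
The six intervals extend from $-\infty$ to $x_1$, from $x_1$ to $x_2$, from $x_2$ to 0, from 0 to
$-x_2$, from $-x_2$ to $-x_1$, and from $-x_1$ to $\infty$.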
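The quoted cut points can be verified by evaluating the three success rates directly. The sketch below assumes a = 1 without loss of generality and uses the hypothetical helper names `normal_cdf` and `six_normal_rates`; it checks the stated values rather than solving for them.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def six_normal_rates(x1, x2, a=1.0, var_ratio=0.32):
    """Success rates of the classes with means -5a, -3a, -a.

    The classes with means +a, +3a, +5a have the same rates by symmetry.
    var_ratio is sigma^2 / a^2.
    """
    sd = a * math.sqrt(var_ratio)
    p1 = normal_cdf(x1, -5 * a, sd)
    p2 = normal_cdf(x2, -3 * a, sd) - normal_cdf(x1, -3 * a, sd)
    p3 = normal_cdf(0.0, -a, sd) - normal_cdf(x2, -a, sd)
    return p1, p2, p3

print([round(p, 3) for p in six_normal_rates(-4.160, -2.062)])
```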

(b) Case of more than one dimension. Let us assume that two classes A and
B have uniform distributions extending over volumes $V_1 = 1/p_1$ and $V_2 = 1/p_2$
respectively. If the two regions have a common part of volume V each surface
within the common space fulfills the condition $p_1/p_2 = \text{constant}$. Thus, the
two regions $R_1$ and $R_2$ are not uniquely determined but subject to one condition
only which determines the optimum success rate. If $\kappa V$ is cut out from $V_1$ and
$(1 - \kappa)V$ from $V_2$, the relation must be fulfilled:

$$1 - p_1V\kappa = 1 - p_2V(1 - \kappa), \quad \text{i.e.} \quad \kappa = \frac{p_2}{p_1 + p_2},$$

and the success rate is

$$S = 1 - p_1V\kappa = 1 - \frac{p_1p_2}{p_1 + p_2}\,V = 1 - p_2V(1 - \kappa).$$


If three classes A, B, and C are considered with the densities $p_1 = 1/V_1$, $p_2 =
1/V_2$, $p_3 = 1/V_3$ and the first two regions have a space of volume V in common,
the latter two a space of volume V', the conditions are

$$1 - p_1V(1 - \kappa) = 1 - p_2(\kappa V + \lambda V') = 1 - p_3(1 - \lambda)V'$$

which supply

$$1 - \kappa = \frac{p_2p_3}{p_1p_2 + p_2p_3 + p_3p_1}\cdot\frac{V + V'}{V}, \qquad 1 - \lambda = \frac{p_1p_2}{p_1p_2 + p_2p_3 + p_3p_1}\cdot\frac{V + V'}{V'},$$

and the success rate has the value

$$S = 1 - \frac{p_1p_2p_3\,(V + V')}{p_1p_2 + p_2p_3 + p_3p_1}.$$
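The two closed forms for uniform classes can be checked with exact fractions. This is a minimal sketch; the function names and the sample densities and volumes are illustrative assumptions, and the code tests only the algebra (it does not check that the resulting fractions of the common volumes lie between 0 and 1).

```python
from fractions import Fraction

def two_class_uniform(p1, p2, v):
    """Share kappa of the common volume v cut from class 1, and the success rate."""
    kappa = p2 / (p1 + p2)
    rate = 1 - p1 * p2 * v / (p1 + p2)
    return kappa, rate

def three_class_uniform(p1, p2, p3, v, v_prime):
    """Shares kappa, lambda and the success rate for three uniform classes."""
    denom = p1 * p2 + p2 * p3 + p3 * p1
    loss = p1 * p2 * p3 * (v + v_prime) / denom  # common value of 1 - P
    kappa = 1 - loss / (p1 * v)
    lam = 1 - loss / (p3 * v_prime)
    return kappa, lam, 1 - loss

p1, p2, p3 = Fraction(1, 10), Fraction(1, 20), Fraction(1, 10)
print(two_class_uniform(p1, p2, Fraction(4)))
print(three_class_uniform(p1, p2, p3, Fraction(2), Fraction(2)))
```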


If the $p_\nu$ are normal density functions, say

$$p_\nu(x, y) = \frac{\sqrt{D_\nu}}{\pi}\, e^{-Q_\nu},$$

$$Q_\nu = \alpha_\nu(x - a_\nu)^2 + 2\beta_\nu(x - a_\nu)(y - b_\nu) + \gamma_\nu(y - b_\nu)^2$$

and $D_\nu$ the corresponding determinants, the curves separating the regions $R_\nu$
are the conics

$$Q_\nu - Q_\mu = \text{const.}$$

where the constants are determined by the conditions that all $P_\nu$ must be equal.
If the $\alpha$, $\beta$, $\gamma$ have the same values for every $\nu$, the borders consist of straight
lines. In this case one can reduce the expressions for $p_\nu$, by an affine transformation,
to

$$p_\nu(x, y) = \frac{1}{\pi}\, e^{-(x - a_\nu)^2 - (y - b_\nu)^2}.$$

In the transformed plane the borderline between the regions $R_\nu$ and $R_\mu$ is
perpendicular to the straight line that connects the points $A_\nu(a_\nu, b_\nu)$ and
$A_\mu(a_\mu, b_\mu)$. If all points $A_\nu$ lie on the same straight line (in particular, if n = 2) the
whole problem is practically identical with the one-dimensional (m = 1). In
the case n = 3, in general, the three regions are confined by three lines
perpendicular to $A_1A_2$, $A_2A_3$, $A_3A_1$ passing through a point C whose coordinates
are determined by the equations $P_1 = P_2 = P_3$. If $r_\nu$ denotes the distance
$A_\nu C$ and $\varphi_\nu$, $\vartheta_\nu$ are the angles $A_\nu C$ forms with the adjacent sides of the triangle
$A_1A_2A_3$ one has to use the function


1 0 
F(r,qi) = oe I o(r — z tan ee dz. 
Then the two conditions for C read

$$F(r_1, \varphi_1) + F(r_1, \vartheta_1) = F(r_2, \varphi_2) + F(r_2, \vartheta_2) = F(r_3, \varphi_3) + F(r_3, \vartheta_3)$$

and the success rate equals 0.5 plus the common value of these three expressions.





ON AN EXTENSION OF THE CONCEPT OF MOMENT WITH APPLICATIONS
TO MEASURES OF VARIABILITY, GENERAL
SIMILARITY, AND OVERLAPPING¹


MILTON DA SILVA RODRIGUES 


State University of São Paulo
1. Introduction. Given a frequency distribution D: $[X_i, F_i]$ $(i =
1, 2, 3, \dots, n)$, we shall call the expression

$$M_r(D, X_j) = \sum_{i=1}^{n} (X_i - X_j)^r F_i$$

the rth total moment of D about the origin $X_j$. We shall consider the weighted
sum

$$\mathfrak{M}_r = \sum_j W_j M_r(D, X_j)$$


where $W_j$ denotes the weight corresponding to the particular origin $X_j$, and the
summation is over a field $\varphi$. In particular, if $\varphi$ is the set of all values assumed
in D by the variate $X_i$, and if $W_j = F_j$, we shall call the quantity the rth complete
total moment of D. If, on the contrary, $W_j$ is the frequency $F'_j$ of the value
$X'_j$ in a second frequency distribution D': $[X'_j, F'_j]$ and $\varphi'$ is the set of all values
assumed by the variate $X'_j$ in D', $\mathfrak{M}_r$ will be called the rth aggregate moment
of D and D'. A modification of this procedure leads to what we shall call the
moment of transvariation of D and D'.

The consideration of complete moments draws attention to certain previously 
known measures of variability which are independent of the origin selected, 
and also provides simple methods of computation which are useful for data 
given in the form of a frequency distribution. The investigation of aggregate 
moments and moments of transvariation gives rise to certain measures of general 
similarity between two distributions, as well as measures of the amount of over- 
lapping. 


2. Sliding and complete moments of a frequency distribution.
2.1. We shall give the name sliding total moments of order r to the successive
values, for particular values of j, of the expression

$$M_r(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j)^r F_i\right]. \tag{2.11}$$


1 The Portuguese original of this paper was written in Brazil, in August 1943. Its transla- 
tion into English was entirely revised by Dr. T. Greville, Bureau of the Census, who pro- 
posed also many simplifications in the derivation of formulae. For his painstaking labor 
and interest I wish to express my very sincere appreciation. I also wish to thank Dr. 
W. Edwards Deming for reading the manuscript and making several valuable suggestions. 







EXTENSION OF MOMENT CONCEPT 
The expression for the complete total moment, written out in full, is

$$\mathfrak{M}_r = \sum_{j=1}^{n} M_r(X_j) = \sum_{j=1}^{n}\sum_{i=1}^{n} (X_i - X_j)^r F_i F_j. \tag{2.12}$$

It is readily seen that the complete moment is independent of the choice of
origin.
2.2. If r = 0, we have

$$M_0(X_j) = F_j \sum_{i=1}^{n} F_i.$$

The complete total moment of order zero will therefore be

$$\mathfrak{M}_0 = \sum_{j=1}^{n} F_j \sum_{i=1}^{n} F_i = M_0^2 \tag{2.21}$$

where $M_0$ stands for the total moment of order zero about the origin of the X,
that is,
$M_0 = N$.
2.3. If r = 1, we shall have

$$M_1(X_j) = F_j \sum_{i=1}^{n} (X_i - X_j) F_i.$$

Using $M_1$ to denote the total moment of order one about the origin of the X,
we obtain

$$M_1(X_j) = F_j \sum_{i=1}^{n} X_i F_i - X_j F_j \sum_{i=1}^{n} F_i = F_j M_1 - X_j F_j M_0.$$

Making j vary from 1 to n and summing, we have

$$\mathfrak{M}_1 = \sum_{j=1}^{n} F_j M_1 - \sum_{j=1}^{n} X_j F_j M_0 = M_0M_1 - M_1M_0 = 0. \tag{2.31}$$


This result is due to the fact that we took the deviations $X_i - X_j$ with their
proper signs. We may, however, calculate the value which the complete moment
of first order would have if absolute values were used. The sliding total
moment thus modified becomes

$$|M_1(X_j)| = F_j \left[\sum_{i=1}^{j-1} (X_j - X_i)F_i + \sum_{i=j}^{n} (X_i - X_j)F_i\right]$$

which may be put in the form

$$|M_1(X_j)| = F_jX_j\left[\sum_{i=1}^{j-1} F_i - \sum_{i=j}^{n} F_i\right] - F_j\left[\sum_{i=1}^{j-1} F_iX_i - \sum_{i=j}^{n} F_iX_i\right]. \tag{2.32}$$










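Summing with respect to j and employing the substitutions

$$\sum_{i=j}^{n} F_i = M_0 - \sum_{i=1}^{j-1} F_i, \qquad \sum_{i=j}^{n} F_iX_i = M_1 - \sum_{i=1}^{j-1} F_iX_i \tag{2.33}$$

gives for the complete total moment

$$|\mathfrak{M}_1| = 2\sum_{j=1}^{n}\left[F_jX_j\sum_{i=1}^{j-1} F_i\right] - 2\sum_{j=1}^{n}\left[F_j\sum_{i=1}^{j-1} F_iX_i\right]. \tag{2.34}$$

The quotient

$$\mathfrak{m}_1 = \frac{|\mathfrak{M}_1|}{\mathfrak{M}_0} \tag{2.35}$$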


of the complete total moment of order one by the complete total moment of order
zero we shall call the complete unit moment of order one, or simply the complete
moment of order one, when no confusion would result.

The complete unit moment is a measure of variability, identical with that
already considered by Andrae and Helmert, respectively in 1869 and in 1876,
and which C. Gini, in 1912, called mean difference with repetition.²

The numerator of $\mathfrak{m}_1$ is easily computed if we observe that the upper limit j − 1
of the $F_i$ summation, for example, means that each product $X_jF_j$ must be multiplied
by the cumulative frequency corresponding to the class immediately preceding.
We only have to shift the cumulative frequencies column by one class
in the proper direction; the second term is similarly dealt with.
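The cumulative-frequency computation just described can be sketched as follows. The function names are hypothetical, the values are assumed to be sorted in ascending order, and a brute-force double sum over all pairs is included only as a cross-check of formula (2.34).

```python
def abs_complete_moment_1(values, freqs):
    """|M_1| by formula (2.34), using running cumulative sums.

    values must be in ascending order; freqs are the matching frequencies.
    """
    cum_f = 0    # cumulative frequency of the preceding classes
    cum_fx = 0   # cumulative sum of F_i * X_i over the preceding classes
    total = 0
    for x, f in zip(values, freqs):
        total += 2 * (f * x * cum_f - f * cum_fx)
        cum_f += f
        cum_fx += f * x
    return total

def abs_complete_moment_1_bruteforce(values, freqs):
    """The same quantity as a direct double sum of |X_i - X_j| F_i F_j."""
    return sum(abs(xi - xj) * fi * fj
               for xi, fi in zip(values, freqs)
               for xj, fj in zip(values, freqs))

def mean_difference_with_repetition(values, freqs):
    """Complete unit moment of order one: |M_1| divided by M_0 squared."""
    return abs_complete_moment_1(values, freqs) / sum(freqs) ** 2

print(abs_complete_moment_1([1, 2, 4], [2, 1, 3]))
print(mean_difference_with_repetition([1, 2, 4], [2, 1, 3]))
```

The single pass replaces the n² pair comparisons by one running sum per term, which is the saving the shifted cumulative-frequency column provides in hand computation.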


2.4. The second order sliding total moment is

$$M_2(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j)^2 F_i\right] = F_jM_2 - 2F_jX_jM_1 + F_jX_j^2M_0$$

where $M_2$ is the total moment of order two. Summing with respect to j gives
the complete total moment of order two

$$\mathfrak{M}_2 = \sum_{j=1}^{n} M_2(X_j) = 2(M_2M_0 - M_1^2). \tag{2.41}$$

The complete unit moment of order two is therefore

$$\mathfrak{m}_2 = 2\left[\frac{M_2}{M_0} - \left(\frac{M_1}{M_0}\right)^2\right] = 2(\nu'_2 - \nu'^2_1) = 2\sigma^2$$








2 Apud Czuber, Wahrscheinlichkeitsrechnung, Vol. 2, (1932), p. 316. C. Gini, Variabilità
e Mutabilità, Cagliari, 1912.







where $\nu'$ stands for a unit moment about the origin of the X, namely

$$\nu'_r = \frac{\sum X^r F}{\sum F}.$$

$\mathfrak{m}_2$ is also a measure of variability, independent of the choice of origin. It is
equal to the square of Gauss's "Präzisionsmass", and to the double of Fisher's
variance; like $\mathfrak{m}_1$ it was defined by Andrae and Helmert, and was called by Gini
the mean square difference with repetition.


2.5. If r = 3 we have for the sliding moments,

$$M_3(X_j) = F_j \sum_{i=1}^{n} (X_i - X_j)^3 F_i = F_jM_3 - 3F_jX_jM_2 + 3F_jX_j^2M_1 - F_jX_j^3M_0.$$

Summation over j gives

$$\mathfrak{M}_3 = \sum_{j=1}^{n} M_3(X_j) = M_0M_3 - 3M_1M_2 + 3M_2M_1 - M_3M_0 = 0, \tag{2.51}$$

a result which is easily shown to hold for any complete moment of odd order.
We may calculate the value of the complete moment of order three using absolute
values of the deviations $X_i - X_j$ by a process similar to that previously described
for the calculation of $|\mathfrak{M}_1|$. This gives

$$|\mathfrak{M}_3| = 2\left[\sum_{j} F_jX_j^3 \sum_{i=1}^{j-1} F_i - 3\sum_{j} F_jX_j^2 \sum_{i=1}^{j-1} F_iX_i + 3\sum_{j} F_jX_j \sum_{i=1}^{j-1} F_iX_i^2 - \sum_{j} F_j \sum_{i=1}^{j-1} F_iX_i^3\right]. \tag{2.52}$$

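2.6. The sliding moments of order four are

$$M_4(X_j) = F_jM_4 - 4F_jX_jM_3 + 6F_jX_j^2M_2 - 4F_jX_j^3M_1 + F_jX_j^4M_0.$$

Summing with respect to j and simplifying, we have

$$\mathfrak{M}_4 = M_0M_4 - 4M_1M_3 + 6M_2^2 - 4M_3M_1 + M_4M_0 = 2(M_0M_4 - 4M_1M_3 + 3M_2^2). \tag{2.61}$$

Dividing both sides by $\mathfrak{M}_0$ in order to obtain the complete moment on a unit
basis, we have

$$\mathfrak{m}_4 = 2\left(\frac{M_4}{M_0} - \frac{4M_1M_3}{M_0^2} + \frac{3M_2^2}{M_0^2}\right) = 2(\nu'_4 - 4\nu'_1\nu'_3 + 3\nu'^2_2).$$

But, if $\nu$ indicates a moment about the mean

$$\nu_4 = \nu'_4 - 4\nu'_1\nu'_3 + 6\nu'^2_1\nu'_2 - 3\nu'^4_1.$$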







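By substitution, therefore

$$\begin{aligned} \mathfrak{m}_4 &= 2(\nu_4 + 3\nu'^2_2 - 6\nu'^2_1\nu'_2 + 3\nu'^4_1) \\ &= 2[\nu_4 + 3(\nu'_2 - \nu'^2_1)^2] \\ &= 2(\nu_4 + 3\nu_2^2). \end{aligned} \tag{2.62}$$

This complete moment gives rise to a measure of kurtosis independent of the
choice of origin

$$\frac{\mathfrak{m}_4}{\mathfrak{m}_2^2} = \frac{1}{2}\left(\frac{\nu_4}{\nu_2^2} + 3\right).$$

In case of mesokurtosis this reduces to 3, since for the normal curve $\nu_4/\nu_2^2 = 3$;
leptokurtosis and platykurtosis occur for the same ranges as in the case of Pearson's
measure $\beta_2$.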
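The closed forms (2.41), (2.51) and (2.61) can be confirmed against the defining double sum (2.12). The sketch below uses hypothetical function names and a small made-up integer distribution, so every comparison is exact.

```python
def total_moment(values, freqs, k):
    """M_k: total moment of order k about the origin of the X."""
    return sum(f * x ** k for x, f in zip(values, freqs))

def complete_moment(values, freqs, r):
    """Complete total moment of order r, as the double sum (2.12)."""
    return sum((xi - xj) ** r * fi * fj
               for xi, fi in zip(values, freqs)
               for xj, fj in zip(values, freqs))

values, freqs = [1, 2, 4], [2, 1, 3]
m0, m1, m2, m3, m4 = (total_moment(values, freqs, k) for k in range(5))

print(complete_moment(values, freqs, 2), 2 * (m2 * m0 - m1 ** 2))
print(complete_moment(values, freqs, 3))
print(complete_moment(values, freqs, 4), 2 * (m0 * m4 - 4 * m1 * m3 + 3 * m2 ** 2))
```

Shifting every value by a constant leaves `complete_moment` unchanged, which is the independence of origin noted after (2.12).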


3. Aggregate moments of two frequency distributions.

3.1. Given two frequency distributions, D: $[X_i, F_i]$ $(i = 1, 2, 3, \dots, n)$ and
D': $[X'_j, F'_j]$ $(j = 1, 2, 3, \dots, p)$ and a fixed point $X'_j$ belonging to the second
distribution, we shall call the expression

$$M_r(D, X'_j) = F'_j \sum_{i=1}^{n} (X_i - X'_j)^r F_i \tag{3.11}$$

the rth aggregate sliding total moment of the first distribution about the element
$X'_j$ of the second. Summation over j gives

$${}^{a}\mathfrak{M}_r = \sum_{j=1}^{p}\sum_{i=1}^{n} F'_j (X_i - X'_j)^r F_i. \tag{3.12}$$

We shall call ${}^{a}\mathfrak{M}_r$ the aggregate complete total moment or, simply, the aggregate
total moment of D about D'. It is clear that this is a symmetric function of the
two distributions, except for a change of sign in the case of odd moments.


3.2. If r = 0, we have

$$M_0(D, X'_j) = F'_j \sum_{i=1}^{n} F_i; \tag{3.21}$$

$${}^{a}\mathfrak{M}_0 = \sum_{j=1}^{p} F'_j \sum_{i=1}^{n} F_i = M_0M'_0. \tag{3.22}$$

3.3. If r = 1, we have

$$M_1(D, X'_j) = F'_jM_1 - F'_jX'_jM_0 \tag{3.31}$$

$${}^{a}\mathfrak{M}_1 = M_1M'_0 - M_0M'_1. \tag{3.32}$$

We shall call the quotient

$${}^{a}\mathfrak{m}_r = \frac{{}^{a}\mathfrak{M}_r}{{}^{a}\mathfrak{M}_0} \tag{3.33}$$

the aggregate unit moment of order r (or the aggregate moment coefficient),
or simply the aggregate moment of order r whenever the simpler name will not
cause confusion.

It is obvious that the aggregate moments are measures of general similarity,
as to form and position, between D and D'. This similarity will be an identity
in case the two distributions coincide perfectly; on the other hand, it is clear that
there is no limit to the degree of non-similarity which may be encountered. We
shall take unity to represent the maximum and zero the minimum of similarity,
and thus define a provisional similarity index

$$S = \frac{\mathfrak{m}_1\mathfrak{m}'_1}{({}^{a}\mathfrak{m}_1)^2}. \tag{3.34}$$

But

$${}^{a}\mathfrak{m}_1 = \frac{M_1M'_0 - M_0M'_1}{M_0M'_0} = A - A'$$

where A and A' stand for the arithmetic means of D and D', respectively. Now
it will be seen that if A = A', S = ∞. This result is due to the fact that in the
calculation of $\mathfrak{m}_1$ and $\mathfrak{m}'_1$ we took the absolute values of the deviations $X_i - X_j$,
while in the calculation of ${}^{a}\mathfrak{m}_1$ we retained the algebraic signs. In order to make
the two terms of the fraction in (3.34) comparable, we can either: 1) calculate
${}^{a}\mathfrak{m}_1$ also using absolute values; or 2) take only the positive or only the negative
part of both numerator and denominator of S. In any case, A = A' is a necessary
condition for the maximum of S.


3.4. We shall employ the first method suggested above, although we shall
return to the second in the third part of the paper. As long as D and D' do not
overlap, all the $X_i - X'_j$ deviations have the same sign and this is the same as
that of the difference A − A'. If, however, there is some overlapping this will
not be the case, some deviations having different signs from that of A − A'.
This brings us to Gini's concept of "transvariation". He applies this term to
any deviation $X_i - X'_j$ which does not have the same sign as $\bar{X} - \bar{X}'$, these
symbols denoting averages of any previously specified type; and he calls the
magnitude of the deviation its "intensity".

In computing the complete moment of the first order using absolute values,
in order to simplify the algebra we shall assume the same origin for X and X'
and therefore drop the stroke from the X, but not of course from the F.
If certain values of X occur in one distribution and not in the other, we can
merely consider the frequency as zero in the second distribution. In this way
the two distributions can be regarded as extending over the same total range.
If $X_1$ and $X_m$ denote the extreme values, the sliding total moment is

$$\begin{aligned} |M_1(D, X_j)| &= F'_j\left[\sum_{i=1}^{j-1} (X_j - X_i)F_i + \sum_{i=j}^{m} (X_i - X_j)F_i\right] \\ &= F'_jX_j\left(\sum_{i=1}^{j-1} F_i - \sum_{i=j}^{m} F_i\right) - F'_j\left(\sum_{i=1}^{j-1} F_iX_i - \sum_{i=j}^{m} F_iX_i\right). \end{aligned}$$













Summing with respect to j and at the same time employing the substitutions
(2.33) or their transposed form, we obtain the following alternative expressions
for the complete aggregate moment:

$$|{}^{a}\mathfrak{M}_1| = M'_0M_1 - M'_1M_0 + 2\sum_{j=1}^{m}\left[F'_jX_j\sum_{i=1}^{j-1} F_i\right] - 2\sum_{j=1}^{m}\left[F'_j\sum_{i=1}^{j-1} F_iX_i\right] \tag{3.41}$$

$$|{}^{a}\mathfrak{M}_1| = M_0M'_1 - M_1M'_0 - 2\sum_{j=1}^{m}\left[F'_jX_j\sum_{i=j}^{m} F_i\right] + 2\sum_{j=1}^{m}\left[F'_j\sum_{i=j}^{m} F_iX_i\right]. \tag{3.42}$$

Note the similarity of the first of these forms to formula (2.34) which is in fact
a particular case of formula (3.41). Alternatively, we may obtain from formula
(3.42) the particular case

$$|\mathfrak{M}_1| = 2\sum_{j=1}^{n}\left(F_j\sum_{i=j}^{n} F_iX_i\right) - 2\sum_{j=1}^{n}\left(F_jX_j\sum_{i=j}^{n} F_i\right) \tag{2.34a}$$

which is equivalent to (2.34).

If the two distributions do not overlap, $|{}^{a}\mathfrak{M}_1|$ does not differ numerically
from ${}^{a}\mathfrak{M}_1$. Let us consider the case in which there is actual overlapping, the
range of non-zero frequencies extending from $X_1$ to $X_{n+p}$ for D and from $X_{n+1}$ to
$X_m$ for D'. Then formula (3.42) becomes, upon merely dropping all vanishing
terms

$$|{}^{a}\mathfrak{M}_1| = M_0M'_1 - M_1M'_0 - 2\sum_{j=n+1}^{n+p}\left[F'_jX_j\sum_{i=j}^{n+p} F_i\right] + 2\sum_{j=n+1}^{n+p}\left[F'_j\sum_{i=j}^{n+p} F_iX_i\right]. \tag{3.43}$$


On the other hand, formula (3.41) reduces, under the same circumstances, to a
much less simple expression, which upon making the substitutions (2.33) and
simplifying reduces to

$$\begin{aligned} |{}^{a}\mathfrak{M}_1| = M_0M'_1 - M_1M'_0 &+ 2\sum_{j=n+1}^{n+p}\left[F'_jX_j\sum_{i=n+1}^{j-1} F_i\right] - 2\sum_{j=n+1}^{n+p}\left[F'_j\sum_{i=n+1}^{j-1} F_iX_i\right] \\ &- 2\sum_{j=n+1}^{n+p} F'_jX_j\sum_{i=n+1}^{n+p} F_i + 2\sum_{j=n+1}^{n+p} F'_j\sum_{i=n+1}^{n+p} F_iX_i. \end{aligned} \tag{3.44}$$

This result may be arrived at somewhat more easily by merely making the substitutions
(2.33) directly in formula (3.43). It may be noted that formula
(3.44) at once reduces to the form (2.34) if the two distributions are identical,
since the additional terms all cancel. It is, however, a less satisfactory result
than formula (3.43) because of the larger number of terms it contains. In order
to obtain a formula which resembles (2.34) more closely, we may reverse the
order of summation in formula (3.43). Observing that the terms for i = j
collectively vanish, we see that

$$|{}^{a}\mathfrak{M}_1| = M_0M'_1 - M_1M'_0 - 2\sum_{i=n+1}^{n+p}\left[F_i\sum_{j=n+1}^{i-1} F'_jX_j\right] + 2\sum_{i=n+1}^{n+p}\left[F_iX_i\sum_{j=n+1}^{i-1} F'_j\right]. \tag{3.45}$$


It will be seen that the simple method of numerical computation described in
section 2.3 is immediately applicable to all the formulas (3.41) to (3.45). Dividing
any of these expressions by ${}^{a}\mathfrak{M}_0$ gives $|{}^{a}\mathfrak{m}_1|$. For example, if formula
(3.43) is used, we have

$$|{}^{a}\mathfrak{m}_1| = A' - A - \frac{2}{M_0M'_0}\left\{\sum_{j=n+1}^{n+p}\left[F'_jX_j\sum_{i=j}^{n+p} F_i\right] - \sum_{j=n+1}^{n+p}\left[F'_j\sum_{i=j}^{n+p} F_iX_i\right]\right\}. \tag{3.46}$$

Substituting this value in equation (3.34), we have

$$S_1 = \frac{\mathfrak{m}_1\mathfrak{m}'_1}{|{}^{a}\mathfrak{m}_1|^2}, \tag{3.47}$$

a quantity which we shall call the "mean coefficient of similarity."

We now observe that $S_1$ is a general measure of similarity whose magnitude
is affected by differences in either form or position. It may, however, be desirable
to eliminate the position element, in order to isolate the form aspect.
To do this it will suffice to relate the value which $|{}^{a}\mathfrak{m}_1|$ would have for A = A',
to the product $\mathfrak{m}_1\mathfrak{m}'_1$. This value of $|{}^{a}\mathfrak{m}_1|$ is, in fact, its minimum; denoting
it by ${}^{a}\mu_1$ we obtain the index


/ 
(3.48) 6=-—F 

Mi 
which we shall call the mean similarity ratio. 

It is clear that all the above mentioned indices measure overlapping as well
as similarity. Overlapping between two distributions will be greatest when
their similarity is greatest, or when $|{}^{a}\mathfrak{m}_1|$ is a minimum. In order to bring
out more clearly the overlapping aspect we may follow Gini's procedure of contrasting
the actual value of a measure with its maximum value. As already
pointed out, if the form of the two distributions is held constant, but their relative
position is varied, the degree of overlapping, as measured by the mean similarity
ratio, is greatest when the arithmetic means coincide. This method of
procedure is embodied in the index


(3.49) T= = 


which we shall call the "intensity of transvariation or overlapping." To calculate
${}^{a}\mu_1$ we may, for example, merely add the difference A' − A = c to the X
values, in order to move D along the X-axis a distance of c, and then proceed to
calculate $|{}^{a}\mathfrak{m}_1|$ in the usual manner from the adjusted X values.


3.5. If, in (3.11), r = 2, we have

$$M_2(D, X'_j) = F'_j \sum_{i=1}^{n} (X_i - X'_j)^2 F_i = F'_jM_2 - 2X'_jF'_jM_1 + X'^2_jF'_jM_0.$$

Summing for j then gives

$${}^{a}\mathfrak{M}_2 = M'_0M_2 - 2M'_1M_1 + M'_2M_0. \tag{3.51}$$

If we define the second aggregate unit moment as

$${}^{a}\mathfrak{m}_2 = \frac{{}^{a}\mathfrak{M}_2}{{}^{a}\mathfrak{M}_0}$$

then

$${}^{a}\mathfrak{m}_2 = \frac{M_2}{M_0} - \frac{2M_1M'_1}{M_0M'_0} + \frac{M'_2}{M'_0} = \sigma^2 + \sigma'^2 + (A - A')^2, \tag{3.52}$$

where the $\sigma$ and the A stand for the standard deviations and the arithmetic
means of the respective distributions. Now we define the "mean square coefficient
of similarity" as the value of

$$S_2 = \frac{\mathfrak{m}_2\mathfrak{m}'_2}{({}^{a}\mathfrak{m}_2)^2} = \frac{4\sigma^2\sigma'^2}{[\sigma^2 + \sigma'^2 + (A - A')^2]^2}. \tag{3.53}$$

It is obvious that a maximum value of $S_2$ requires that A = A' as a necessary
condition for the maximum degree of overlapping. Maximum similarity requires,
in addition, $\sigma = \sigma'$, in which case $S_2 = 1$.
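As a check on (3.52) and (3.53), the sketch below computes the second aggregate unit moment both from the defining double sum and from the means and variances, and then forms the mean square coefficient of similarity. The function names and the two small distributions are illustrative assumptions; exact fractions avoid rounding.

```python
from fractions import Fraction

def mean_and_variance(values, freqs):
    """Arithmetic mean A and variance sigma^2 of a frequency distribution."""
    m0 = sum(freqs)
    mean = Fraction(sum(f * x for x, f in zip(values, freqs)), m0)
    second = Fraction(sum(f * x * x for x, f in zip(values, freqs)), m0)
    return mean, second - mean ** 2

def aggregate_unit_moment_2(values, freqs, values2, freqs2):
    """Second aggregate unit moment as the double sum (3.12) divided by M_0 M'_0."""
    total = sum(f2 * (x - x2) ** 2 * f
                for x, f in zip(values, freqs)
                for x2, f2 in zip(values2, freqs2))
    return Fraction(total, sum(freqs) * sum(freqs2))

def similarity_s2(values, freqs, values2, freqs2):
    """Mean square coefficient of similarity, right member of (3.53)."""
    a, var = mean_and_variance(values, freqs)
    a2, var2 = mean_and_variance(values2, freqs2)
    return 4 * var * var2 / (var + var2 + (a - a2) ** 2) ** 2

d = ([1, 2, 4], [2, 1, 3])
d_prime = ([2, 3], [1, 1])
print(aggregate_unit_moment_2(*d, *d_prime), similarity_s2(*d, *d_prime))
```

Two identical distributions give `similarity_s2` equal to 1, the stated maximum.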

For a measure of similarity which is independent of difference in position be- 
tween the two distributions, we define. 


(3.54) 


where ${}^{a}\mu_2$ is the minimum value of ${}^{a}\mathfrak{m}_2$ for all positions of the two distributions,
without changing their form. This is obtained by merely taking

$${}^{a}\mu_2 = \sigma^2 + \sigma'^2. \tag{3.55}$$


For a measure of overlapping we can follow Gini in contrasting the actual
value of ${}^{a}\mathfrak{m}_2$ with its minimum ${}^{a}\mu_2$, since the maximum of overlapping corresponds
to the minimum value of ${}^{a}\mathfrak{m}_2$. We thus set

$$T_2 = \frac{{}^{a}\mu_2}{{}^{a}\mathfrak{m}_2} = \frac{\sigma^2 + \sigma'^2}{\sigma^2 + \sigma'^2 + (A - A')^2}, \tag{3.56}$$

a measure which we shall call the "density of overlapping". Its maximum
value is unity.

It may be remarked that all the indices proposed in this paragraph are easier
to calculate than those of paragraph 3.4. The individual terms are all functions
of only one of the two distributions; yet the resulting indices are independent of
the origin chosen, and therefore free from any criticism based on doubt as to the
representativeness of the arithmetic mean, in cases of marked skewness.


4. Positive and negative moments, and moments of transvariation. 


4.1. The aggregate sliding total moment of two frequency distributions D
and D' may be expressed in the form

$$M_r(D, X_j) = F'_j \sum_{i=1}^{j-1} (X_i - X_j)^r F_i + F'_j \sum_{i=j}^{m} (X_i - X_j)^r F_i \tag{4.11}$$

when both distributions have been artificially extended, if necessary, to cover
the same total range, as previously described in section 3.4. We shall characterize
the second term in the right member of (4.11) as the positive sliding
moment, and the absolute value of the first term as the negative sliding moment.
We shall denote these moments by ${}^{+}M_r(D, X_j)$ and ${}^{-}M_r(D, X_j)$. The complete
moments obtained by summing these separate terms over the range of values of
j we shall call the positive and negative aggregate complete moments. Thus
the positive complete moment is

$${}^{+}\mathfrak{M}_r = \sum_{j=1}^{m}\left[F'_j \sum_{i=j}^{m} (X_i - X_j)^r F_i\right] \tag{4.12}$$

and the negative complete moment is

$${}^{-}\mathfrak{M}_r = \left|\sum_{j=1}^{m}\left[F'_j \sum_{i=1}^{j-1} (X_i - X_j)^r F_i\right]\right|. \tag{4.13}$$

That one of these two partial moments which is obtained from differences $X_i -
X_j$ having the opposite sense to that of the difference $\bar{X} - \bar{X}'$ will be called the
moment of transvariation of the two distributions and will be denoted by the
symbol ${}^{T}\mathfrak{M}_r$. Here, as in section 3.4, $\bar{X}$ and $\bar{X}'$ denote averages of any previously
selected type. For example, if the arithmetic means are the averages
selected, and if A − A' is positive, then the negative aggregate moment is the
moment of transvariation, and vice-versa.
In the trivial case in which the two distributions are identical, the positive
and negative complete moments are equal, and both reduce to merely one half
the aggregate complete moment (computed by the use of absolute values in the
case of moments of odd order).


The unit moment of transvariation will be defined as 


(4.14) 


4.2. It is evident that the moments of transvariation can be considered as 
measures of overlapping. Any such moment equals zero when there is no over- 
lapping and becomes greatest when the two distributions coincide. Taking unity 
to represent the maximum and zero the minimum of overlapping, we may choose 
as a general measure of overlapping, 

4"m? 47M; 


[| m:| | mee] | mer |” 

It will be seen that this quantity always equals zero when there is no overlapping, 
and equals unity when there is complete overlapping: that is when the two dis- 
tributions are identical. 


(4.21) T; 


~ [m, 


5. Need for further developments. All of the measures above described 
were defined for the case of finite sets of magnitudes, expressed as frequency 
distributions D and D’. Now these sets of magnitudes may be thought of as 
samples drawn out of their corresponding universes. The consideration of these 
universes would lead to more general representations under the form of frequency 
functions, and the above measures would be expressed as definite integrals rather 
than summations. This draws attention to the need for tests of significance of 
the magnitude of all the above measures, especially those of overlapping, in 
order to allow for sampling fluctuation. Obviously, when the frequency func- 
tions are of the asymptotic type some amount of overlapping will always exist. 





ON A PROBLEM OF ESTIMATION OCCURRING IN PUBLIC OPINION 
POLLS 


By Henry B. Mann 
Ohio State University 


To arrive at an estimate of the number of electoral votes that will be cast for
a presidential candidate a poll is taken of $\lambda_iN$ interviews in the ith state ($i = 1,
\dots, 48$) where the $\lambda_i$ are fixed constants > 0 such that $\sum\lambda_i = 1$ and the respondent
is asked for which candidate he intends to cast his vote. To estimate
the number of electoral votes which candidate A will receive, the electoral votes
of all the states in which the poll shows a majority for candidate A are added
and their sum is used as an estimate for the number of electoral votes which
candidate A will receive. In this paper certain properties of this estimate will
be discussed. It will be shown that it is a biased but consistent estimate and
an upper bound for the bias will be derived. Finally we shall derive that distribution
of interviews which minimizes the variance of our estimate.

In all that follows we shall consider the poll as a random or stratified random
sample and shall disregard the bias introduced by inaccurate answers. Our
results however remain valid as long as the sampling variance is proportional
to $1/N$.

We shall use the following notation:
$\pi_i$ = proportion of voters in the ith state who intend to vote for candidate A.
$\epsilon_i = 1$ if $\pi_i > \tfrac{1}{2}$,
$\epsilon_i = 0$ if $\pi_i < \tfrac{1}{2}$.
$w_i$ = number of electoral votes of the ith state.
$p_i$, $e_i$ = sample values of $\pi_i$ and $\epsilon_i$ resp.
We shall further exclude the case $\pi_i = \tfrac{1}{2}$.
The number of electoral votes for candidate A is then given by

$$\sum_{i=1}^{48} \epsilon_i w_i = \Gamma.$$

As an estimate of $\Gamma$ we use the quantity

$$\sum_{i=1}^{48} e_i w_i = G. \tag{1}$$


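Let $\rho_i$ be the probability that $p_i > \tfrac{1}{2}$ and hence $e_i = 1$. Let $\lambda_iN = N_i$ be the
number of interviews in the ith state. If $N_i$ is not too small then $\rho_i$ is given by

$$\rho_i = \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{1/2}^{\infty} e^{-(x - \pi_i)^2/2\sigma_i^2}\,dx. \tag{2}$$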


























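In this formula $\sigma_i = \sqrt{\pi_i(1 - \pi_i)/N_i}$ if the sample is an unstratified random
sample and may be somewhat less if the sample is a stratified random sample.¹
For our purposes it is sufficient to assume that $\sigma_i$ is proportional to $1/\sqrt{N_i}$.

We then have $E(e_i) = \rho_i$ and

$$E(G) = E\left(\sum_{i=1}^{48} e_i w_i\right) = \sum_{i=1}^{48} \rho_i w_i. \tag{3}$$

Hence G is a biased estimate of $\Gamma$. On the other hand² $\operatorname{plim}_{N\to\infty} p_i = \pi_i$ and
hence $\operatorname{plim}_{N\to\infty} e_i = \epsilon_i$ and therefore $\operatorname{plim}_{N\to\infty} G = \Gamma$. That is to say G is a
consistent estimate of $\Gamma$.
According to (3) the bias is given by

$$B(N) = \sum_{i=1}^{48} \epsilon_i w_i - \sum_{i=1}^{48} \rho_i w_i = \sum_{i=1}^{48} (\epsilon_i - \rho_i) w_i. \tag{4}$$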
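We have

$$\epsilon_i - \rho_i = -\frac{1}{\sqrt{2\pi}} \int_{(\frac{1}{2} - \pi_i)/\sigma_i}^{\infty} e^{-\frac{1}{2}x^2}\,dx \quad \text{if } \pi_i < \tfrac{1}{2},$$

$$\epsilon_i - \rho_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(\frac{1}{2} - \pi_i)/\sigma_i} e^{-\frac{1}{2}x^2}\,dx \quad \text{if } \pi_i > \tfrac{1}{2}.$$

For a stratified as well as for an unstratified sample $\sigma_i$ is proportional to $1/\sqrt{N_i}$
and we therefore put

$$\frac{\frac{1}{2} - \pi_i}{\sigma_i} = \begin{cases} \gamma_i\sqrt{N_i} & \text{if } \pi_i < \tfrac{1}{2} \\ -\gamma_i\sqrt{N_i} & \text{if } \pi_i > \tfrac{1}{2}. \end{cases} \tag{5}$$

Then we have in both cases

$$|\epsilon_i - \rho_i| = \frac{1}{\sqrt{2\pi}} \int_{\gamma_i\sqrt{N_i}}^{\infty} e^{-\frac{1}{2}x^2}\,dx. \tag{6}$$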

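We have for $a > 0$
$$\int_a^{\infty} e^{-x^2/2}\,dx < h\bigl(e^{-a^2/2} + e^{-(a+h)^2/2} + e^{-(a+2h)^2/2} + \cdots\bigr) < e^{-a^2/2}\,h\,(1 + e^{-ah} + e^{-2ah} + \cdots) = \frac{h\,e^{-a^2/2}}{1 - e^{-ah}}$$
for every value $h$.
Since $\lim_{h\to 0} \dfrac{h}{1 - e^{-ah}} = \dfrac{1}{a}$ we have
$$\int_a^{\infty} e^{-x^2/2}\,dx < \frac{e^{-a^2/2}}{a} \quad\text{for every } a > 0. \tag{7}$$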
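Inequality (7) can be checked numerically. The sketch below uses the identity $\int_a^\infty e^{-x^2/2}\,dx = \sqrt{\pi/2}\,\operatorname{erfc}(a/\sqrt2)$; the function names and the chosen values of $a$ are illustrative only.

```python
import math

def tail_integral(a):
    """Integral of exp(-x^2/2) from a to infinity, written with erfc."""
    return math.sqrt(math.pi / 2.0) * math.erfc(a / math.sqrt(2.0))

def bound(a):
    """Right-hand side of (7): exp(-a^2/2) / a, defined for a > 0."""
    return math.exp(-0.5 * a * a) / a

for a in (0.1, 0.5, 1.0, 2.0, 4.0):
    # Each line shows a, the integral, and the bound from (7).
    print(a, tail_integral(a), bound(a))
```

The bound is loose for small $a$ and becomes tight as $a$ grows, which is the regime that matters for (8) and (9).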
a 

1The variance in public opinion polls is somewhat larger than the random sampling 
variance due to the fact that a cluster sample is used and not a random sample. For the 
same reason the estimate p; of x; may be biased. 

2 For the notation used here see: H. B. Mann and A. Wald, "On stochastic limit and
order relationships," Annals of Math. Stat., (1943), pp. 217-227.


From (6) and (7) we obtain
$$|\epsilon_i - \rho_i| < \frac{e^{-\gamma_i^2 N_i/2}}{\sqrt{2\pi N_i}\,\gamma_i}. \tag{8}$$
From (4) and (8) we have
$$|B(N)| < \frac{1}{\sqrt{2\pi}}\sum_{i=1}^{48} \frac{w_i\, e^{-\gamma_i^2 N_i/2}}{\gamma_i\sqrt{N_i}}. \tag{9}$$
Formula (9) is valid whenever $\pi_i \neq \tfrac12$ and shows that $B(N)$ converges rapidly
to 0 for all values $\pi_i \neq \tfrac12$.

To obtain an approximate idea of the magnitude of the bias we may in (4)
replace $\epsilon_i$ and $\rho_i$ by their sample values $e_i$ and $r_i$. The quantity
$\sum_{i=1}^{48} w_i(e_i - r_i)$ can, however, not be regarded as an estimate of $B(N)$.

We now proceed to compute the standard error of $G$. We may consider the
poll as 48 single experiments where the probability of success in the $i$th experiment
is given by $\rho_i$ where
$$\frac{1}{\sqrt{2\pi}}\int_{\gamma_i\sqrt{N_i}}^{\infty} e^{-x^2/2}\,dx = \begin{cases}\rho_i & \text{if } \pi_i < \tfrac12,\\ 1-\rho_i & \text{if } \pi_i > \tfrac12.\end{cases}$$
Hence the variance of G is given by 


$$\sigma^2 = \sum_{i=1}^{48} \rho_i(1-\rho_i)\,w_i^2. \tag{10}$$
As an estimate of $\sigma^2$ we can use the quantity $S^2$ obtained by replacing $\rho_i$ by
its sample value.

We shall consider that distribution of interviews as best which minimizes
$E[(G - T)^2]$.

We have
$$E[(G - T)^2] = \sigma^2 + B^2(N).$$
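The decomposition of the mean square error into $\sigma^2 + B^2(N)$ makes it possible to compare two distributions of the same total number of interviews. The sketch below is only an illustration under the unstratified assumption $\sigma_i = \sqrt{\pi_i(1-\pi_i)/N_i}$; the states, the two allocations, and the helper names are hypothetical.

```python
import math

def rho(pi_i, n_i):
    """Normal approximation to the probability that p_i > 1/2."""
    sigma_i = math.sqrt(pi_i * (1.0 - pi_i) / n_i)
    return 0.5 * math.erfc((0.5 - pi_i) / (sigma_i * math.sqrt(2.0)))

def mean_square_error(pis, ws, ns):
    """sigma^2 + B(N)^2, using formulas (10) and (4)."""
    rhos = [rho(p, n) for p, n in zip(pis, ns)]
    variance = sum(r * (1.0 - r) * w * w for r, w in zip(rhos, ws))
    bias = sum(((1.0 if p > 0.5 else 0.0) - r) * w
               for p, r, w in zip(pis, rhos, ws))
    return variance + bias * bias

# Hypothetical proportions and electoral votes; both allocations total 1500.
pis = [0.45, 0.52, 0.60]
ws = [10, 25, 8]
equal_split = mean_square_error(pis, ws, [500, 500, 500])
shifted = mean_square_error(pis, ws, [300, 1100, 100])

print(round(equal_split, 2), round(shifted, 2))
```

In this example, moving interviews toward the large, closely contested state lowers the mean square error, in line with the aim of minimizing $\sigma^2 + B^2(N)$ under a fixed total.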


We therefore have the problem of minimizing $\sigma^2 + B^2(N)$ under the restriction
$\sum_{i=1}^{48} N_i = N$.


We have 











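Hence applying the method of Lagrange operators, we obtain
$$\frac{\partial[\sigma^2 + B^2(N)]}{\partial N_i} = \frac{\partial\rho_i}{\partial N_i}\,w_i\,[w_i(1 - 2\rho_i) - 2B(N)] = \lambda, \quad (i = 1, \cdots, 48), \qquad \sum_{i=1}^{48} N_i = N. \tag{11}$$
Here, by (5), $\partial\rho_i/\partial N_i = \mp\,\gamma_i e^{-\gamma_i^2 N_i/2}/(2\sqrt{2\pi N_i})$, with the upper sign for $\pi_i < \tfrac12$ and the lower sign for $\pi_i > \tfrac12$.
The parameters $\gamma_i$ and $\pi_i$ in equation (11) can be estimated from a previous
poll.³ It is not certain that (11) has always solutions. However if the quantity
$\sigma^2 + B^2(N)$ has a minimum for a set of values $N_1, \cdots, N_{48}$ with $N_i \neq 0$ ($i = 1,
\cdots, 48$) then (11) must have a solution.
One might be induced to try to estimate $\sum \rho_i w_i$ directly by using
$$r_i = \frac{1}{\sqrt{2\pi}}\int_{(1/2-p_i)/s_i}^{\infty} e^{-x^2/2}\,dx$$
as an estimate of $\rho_i$. It is easy to see that $r_i$ is a consistent
estimate of $\epsilon_i$. It will be shown however that this estimate is more
biased than the estimate (1).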

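Since $\sigma_i$ differs only very little from its sample estimate $s_i$ we may replace this
sample estimate by $\sigma_i$. We then have
$$E(r_i) = \frac{1}{2\pi\sigma_i^2}\int_{-\infty}^{+\infty}\Bigl(\int_{1/2}^{\infty} e^{-(x-p_i)^2/2\sigma_i^2}\,dx\Bigr)e^{-(p_i-\pi_i)^2/2\sigma_i^2}\,dp_i = \frac{1}{2\pi\sigma_i^2}\int_{-\infty}^{+\infty}\int_{1/2}^{\infty} e^{-[(x-p_i)^2+(p_i-\pi_i)^2]/2\sigma_i^2}\,dx\,dp_i.$$
Now
$$(x - p_i)^2 + (p_i - \pi_i)^2 = \frac{(x-\pi_i)^2}{2} + 2\Bigl(p_i - \frac{x+\pi_i}{2}\Bigr)^2.$$
Hence
$$E(r_i) = \frac{1}{2\pi\sigma_i^2}\int_{1/2}^{\infty} e^{-(x-\pi_i)^2/4\sigma_i^2}\int_{-\infty}^{+\infty} e^{-(p_i-\frac{x+\pi_i}{2})^2/\sigma_i^2}\,dp_i\,dx.$$
The second integral is equal to $\sqrt{\pi\sigma_i^2}$. Hence
$$E(r_i) = \frac{1}{2\sqrt{\pi\sigma_i^2}}\int_{1/2}^{\infty} e^{-(x-\pi_i)^2/4\sigma_i^2}\,dx = \frac{1}{\sqrt{2\pi}}\int_{(1/2-\pi_i)/\sigma_i\sqrt2}^{\infty} e^{-x^2/2}\,dx. \tag{12}$$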

3 If $\pi_i$ for any $i$ were very close to $\tfrac12$ then it would be of little use to poll the $i$th state.
Hence, in this case formula (11) gives a small value for $N_i$. However, the $\pi_i$ are never
accurately known. The following procedure might be recommended for determining the
best distribution of interviews: If for one particular $i$ the sample value of $\pi_i$ as estimated
from a previous poll is too close to $\tfrac12$ determine, using the $N_i$ of the previous poll, that value
a; of x; for which the probability is ;4 that p; is larger than 3 and substitute in (11) 7; 
for x;. In all other cases substitute the sample value. 

If several polls are taken it is advisable to use all of them but the last one to estimate
as closely as possible the values of the $\pi_i$. The sample of the last poll before the election
should be distributed according to (11).





From (12) we see that $E(r_i) < \rho_i$ if $\pi_i > \tfrac12$ and $E(r_i) > \rho_i$ if $\pi_i < \tfrac12$.
Thus in every case this estimate is more biased than the estimate (1).
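The comparison drawn from (12) can be illustrated numerically. In the sketch below $d$ stands for $(\tfrac12 - \pi_i)/\sigma_i$, so that $\rho_i$ is the normal upper tail at $d$ and $E(r_i)$, by (12), is the upper tail at $d/\sqrt2$; the function names are hypothetical.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def rho_from_d(d):
    """rho_i expressed through d = (1/2 - pi_i) / sigma_i."""
    return upper_tail(d)

def mean_r_from_d(d):
    """E(r_i) according to (12): the same tail taken at d / sqrt(2)."""
    return upper_tail(d / math.sqrt(2.0))

for d in (-2.0, -1.0, 1.0, 2.0):
    # d > 0 corresponds to pi_i < 1/2, d < 0 to pi_i > 1/2.
    print(d, rho_from_d(d), mean_r_from_d(d))
```

For $d > 0$ the true value $\epsilon_i$ is 0 and $E(r_i)$ lies farther from it than $\rho_i$ does; for $d < 0$ the true value is 1 and the same holds.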

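On the other hand, we shall now show that $E[(\epsilon_i - r_i)^2]$ is always smaller than
$E[(\epsilon_i - e_i)^2]$. Since $\epsilon_i = 1$ if $\pi_i > \tfrac12$ and $\epsilon_i = 0$ if $\pi_i < \tfrac12$ it is easy to verify that
$E[(\epsilon_i - r_i)^2]$ has the same value for $\pi_i = a$ as for $\pi_i = 1 - a$ and the same is true
for $E[(\epsilon_i - e_i)^2]$. We may, therefore, without loss of generality assume that
$\pi_i < \tfrac12$.

Thus we have to show that
$$E(r_i^2) < E(e_i^2) = \rho_i = \frac{1}{\sqrt{2\pi}}\int_{(1/2-\pi_i)/\sigma_i}^{\infty} e^{-x^2/2}\,dx \quad\text{if } \pi_i < \tfrac12. \tag{13}$$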


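We have
$$E(r_i^2) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\int_{-\infty}^{\infty} r_i^2\,e^{-(p_i-\pi_i)^2/2\sigma_i^2}\,dp_i = \frac{1}{(\sqrt{2\pi}\,\sigma_i)^3}\int_{-\infty}^{\infty}\int_{1/2}^{\infty}\int_{1/2}^{\infty} e^{-(1/2\sigma_i^2)\,Q(x,y,p_i)}\,dx\,dy\,dp_i.$$
Now
$$Q(x, y, p_i) = (x - p_i)^2 + (y - p_i)^2 + (p_i - \pi_i)^2 = 3\Bigl(p_i - \frac{x+y+\pi_i}{3}\Bigr)^2 + \frac{(x+y-2\pi_i)^2}{6} + \frac{(x-y)^2}{2}.$$
Putting
$$p' = \frac{\sqrt3}{\sigma_i}\Bigl(p_i - \frac{x+y+\pi_i}{3}\Bigr),\quad x' = \frac{x+y-2\pi_i}{\sqrt6\,\sigma_i},\quad y' = \frac{x-y}{\sqrt2\,\sigma_i},\quad a = \frac{1-2\pi_i}{\sqrt6\,\sigma_i}$$
we obtain
$$E(r_i^2) = \frac{1}{(\sqrt{2\pi})^3}\int_{-\infty}^{\infty}\int_a^{\infty} e^{-p'^2/2}\,e^{-x'^2/2}\Bigl(\int_{\sqrt3(a-x')}^{\sqrt3(x'-a)} e^{-y'^2/2}\,dy'\Bigr)dx'\,dp' = \frac{1}{2\pi}\int_a^{\infty}\int_{\sqrt3(a-x)}^{\sqrt3(x-a)} e^{-x^2/2}\,e^{-y^2/2}\,dy\,dx.$$
Now for $\pi_i = \tfrac12$ we have $a = 0$, and for $\pi_i < \tfrac12$ we have $a > 0$. For $a = 0$ we
obviously have $E(r_i^2) < E(e_i^2)$. Further $\lim_{a\to\infty} E(r_i^2) = \lim_{a\to\infty} E(e_i^2) = 0$ hence (13)
is proved if we can show that
$$F(a) = E(r_i^2) - E(e_i^2) = \frac{1}{2\pi}\int_a^{\infty}\int_{\sqrt3(a-x)}^{\sqrt3(x-a)} e^{-x^2/2}\,e^{-y^2/2}\,dy\,dx - \frac{1}{\sqrt{2\pi}}\int_{\sqrt{3/2}\,a}^{\infty} e^{-x^2/2}\,dx$$
is a monotonically increasing function of $a$. Differentiating $F(a)$ with respect
to $a$ we obtain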


dF (a) = awl —}(4r2—6azr+3a2) > - v3 el 4)a2 
da T be i. 
-v3 e Al#a? [ ete GlAa)® ae v3, —(3/4)a2 
T a iJ. 


= e 4)a? fe = de oo V3 eH? 
a io.” 


9° 


Hence for a > 0 we have 


dF » =v? et 7 V3. ei?” > 0. 
da = 2/ x 
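Hence we have proved
$$E[(\epsilon_i - r_i)^2] = \frac{1}{2\pi}\int_{|a|}^{\infty}\int_{\sqrt3(|a|-x)}^{\sqrt3(x-|a|)} e^{-x^2/2}\,e^{-y^2/2}\,dy\,dx < E[(\epsilon_i - e_i)^2]. \tag{15}$$
Since
$$E[(\epsilon_i - e_i)^2] - E[(\epsilon_i - r_i)^2]$$
is largest when $\pi_i = \tfrac12$ we also have
$$E[(\epsilon_i - r_i)^2] > |\epsilon_i - \rho_i| - \Bigl[\tfrac12 - \frac{1}{2\pi}\int_0^{\infty}\int_{-\sqrt3 x}^{\sqrt3 x} e^{-x^2/2}\,e^{-y^2/2}\,dy\,dx\Bigr]$$
or
$$|\epsilon_i - \rho_i| > E[(\epsilon_i - r_i)^2] > \frac{1}{2\pi}\int_0^{\infty}\int_{-\sqrt3 x}^{\sqrt3 x} e^{-x^2/2}\,e^{-y^2/2}\,dy\,dx - |\tfrac12 - \rho_i|. \tag{16}$$
Because of (15), $r_i$ although more biased may in many cases be preferable
to $e_i$ as an estimate of $\epsilon_i$.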





NOTES 


This section is devoted to brief research and expository articles, notes on method- 
ology and other short items. 


a 


A COMBINATORIAL FORMULA AND ITS APPLICATION TO THE 
THEORY OF PROBABILITY OF ARBITRARY EVENTS’ 


By Kai-Lai Chung and Lietz C. Hsu


National Southwest Associated University, Kunming, China 


An important principle, known as a proposition in formal logic or the method
of cross-classification can be stated as follows.¹

Let $F$ and $f$ be any two functions of combinations out of $(\nu) = (1, 2, \cdots, n)$.
Then the two formulas
$$F((\alpha)) = \sum_{(\beta)\in(\nu)-(\alpha)} f((\alpha) + (\beta)) \tag{1.1}$$
$$f((\alpha)) = \sum_{(\beta)\in(\nu)-(\alpha)} (-1)^b\,F((\alpha) + (\beta)) \tag{2.1}$$
are equivalent.
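The equivalence of (1.1) and (2.1) can be checked by brute force on a small set. The sketch below is only an illustration: combinations are represented as Python frozensets, $b$ is taken to be the number of elements of $(\beta)$, and the random function $f$ and the choice $n = 4$ are hypothetical.

```python
import random
from itertools import combinations

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = tuple(items)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)

n = 4
universe = frozenset(range(1, n + 1))
random.seed(0)
f = {alpha: random.random() for alpha in subsets(universe)}

# (1.1): F(alpha) is the sum of f(alpha + beta) over beta in (nu) - (alpha).
F = {alpha: sum(f[alpha | beta] for beta in subsets(universe - alpha))
     for alpha in f}

# (2.1): f is recovered from F with the alternating sign (-1)^b.
f_back = {alpha: sum((-1) ** len(beta) * F[alpha | beta]
                     for beta in subsets(universe - alpha))
          for alpha in f}

print(max(abs(f[alpha] - f_back[alpha]) for alpha in f))
```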


As an immediate application to the theory of probability of arbitrary events,
we have the set of inversion formulas²
$$p((\alpha)) = \sum_{(\beta)\in(\nu)-(\alpha)} p[(\alpha) + (\beta)] \tag{3.1}$$
$$p[(\alpha)] = \sum_{(\beta)\in(\nu)-(\alpha)} (-1)^b\,p((\alpha) + (\beta)) \tag{4.1}$$
where $p((\alpha))$ is the probability of the occurrence of at least $E_{\alpha_1}, E_{\alpha_2}, \cdots, E_{\alpha_a}$
out of $n$ arbitrary events $E_1, E_2, \cdots, E_n$ and $p[(\alpha)]$ is the probability of the
occurrence of $E_{\alpha_1}, E_{\alpha_2}, \cdots, E_{\alpha_a}$ and no others among the $n$ events, $(\alpha_1, \alpha_2,
\cdots, \alpha_a)$ denoting a combination of the integers $(1, 2, \cdots, n)$. They can be
made to play a central rôle in the theory, since they supply a method for converting
the fundamental systems of probabilities, $p[(\alpha)]$ and $p((\alpha))$, one into the
other.

We may further generalize (1.1) and (2.1) by considering combinations with
repetitions. Let such a combination be written as
$$(\alpha) = (\alpha^r) = (\alpha_1^{r_1}\alpha_2^{r_2}\cdots\alpha_a^{r_a})$$


1 For the notations and definitions see K. L. Chung, "On fundamental systems of probabilities
of a finite number of events," Annals of Math. Stat., Vol. 14 (1943), pp. 123-133.

2 Cf. Fréchet, Les probabilités associées à un système d'événements compatibles et dépendants,
Hermann, Paris (1939), formulas (55) and (58).


where $r_i$ ($r_i \ge 1$) denotes the number of repetitions of the number $\alpha_i$, $i =
1, 2, \cdots, a$. Correspondingly we write
$$(\alpha)' = (\alpha_1\alpha_2\cdots\alpha_a)$$
and call it the reduced combination corresponding to $(\alpha)$.


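If there are $n$ distinct elements $(1, 2, \cdots, n)$ in question, we may write every
combination in the form
$$(1^{r_1}2^{r_2}\cdots n^{r_n})$$
where each $r_i$ is zero or a positive integer. We say that $(1^{s_1}2^{s_2}\cdots n^{s_n})$ belongs
to $(1^{r_1}2^{r_2}\cdots n^{r_n})$ and write
$$(1^{s_1}2^{s_2}\cdots n^{s_n}) \in (1^{r_1}2^{r_2}\cdots n^{r_n})$$
if and only if for each $i$, $i = 1, 2, \cdots, n$, we have $s_i \le r_i$. We write
$$(1^{r_1}2^{r_2}\cdots n^{r_n}) + (1^{s_1}2^{s_2}\cdots n^{s_n}) = (1^{r_1+s_1}2^{r_2+s_2}\cdots n^{r_n+s_n}),$$
and if $(1^{s_1}2^{s_2}\cdots n^{s_n}) \in (1^{r_1}2^{r_2}\cdots n^{r_n})$,
$$(1^{r_1}2^{r_2}\cdots n^{r_n}) - (1^{s_1}2^{s_2}\cdots n^{s_n}) = (1^{r_1-s_1}2^{r_2-s_2}\cdots n^{r_n-s_n}).$$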


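We define a generalized Möbius function $\mu((\alpha))$ for combinations (with or without
repetitions) as follows
$$\mu((\alpha)) = \begin{cases}(-1)^a & \text{if } (\alpha) = (\alpha)',\\ 0 & \text{if } (\alpha) \neq (\alpha)'.\end{cases}$$
This function has the property
$$\sum_{(\beta)\in(\alpha)} \mu((\beta)) = \begin{cases}1 & \text{if } (\alpha) = (0),\\ 0 & \text{if } (\alpha) \neq (0).\end{cases}$$
For we have
$$\sum_{(\beta)\in(\alpha)} \mu((\beta)) = \sum_{(\beta)\in(\alpha)'} \mu((\beta)) = \sum_{b=0}^{a} (-1)^b\binom{a}{b} = \begin{cases}1 & \text{if } a = 0\\ 0 & \text{if } a \neq 0\end{cases} = \begin{cases}1 & \text{if } (\alpha) = (0)\\ 0 & \text{if } (\alpha) \neq (0).\end{cases}$$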
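The summation property of the generalized Möbius function can be verified directly for small combinations. In the sketch below a combination $(1^{r_1}2^{r_2}\cdots n^{r_n})$ is represented, as an assumption of the illustration, by its tuple of exponents $(r_1, \cdots, r_n)$; the function names are hypothetical.

```python
from itertools import product

def mu(exponents):
    """Generalized Moebius function of a combination given by its exponents."""
    if any(e > 1 for e in exponents):
        return 0                      # a repetition occurs: (alpha) != (alpha)'
    return (-1) ** sum(exponents)     # no repetition: (-1)^a

def sub_combinations(exponents):
    """All (s_1, ..., s_n) with 0 <= s_i <= r_i, i.e. all (beta) in (alpha)."""
    return product(*(range(e + 1) for e in exponents))

def moebius_sum(exponents):
    """Sum of mu((beta)) over all (beta) belonging to (alpha)."""
    return sum(mu(s) for s in sub_combinations(exponents))

print(moebius_sum((0, 0, 0)), moebius_sum((2, 0, 1)), moebius_sum((3, 2, 4)))
```

The sum equals 1 only for the empty combination $(0)$ and vanishes otherwise, since only the sub-combinations of the reduced combination contribute.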
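Now we state and prove the following general theorem.
THEOREM. Let $(\alpha)_i = (\alpha_{i1}^{r_{i1}}\alpha_{i2}^{r_{i2}}\cdots\alpha_{ia_i}^{r_{ia_i}})$ and $(\nu)_i = (1^{\lambda_{i1}}2^{\lambda_{i2}}\cdots n_i^{\lambda_{in_i}})$
where $\lambda_{ij}$ and $n_i$ are finite and $1 \le r_{ij} \le \lambda_{ij}$, $1 \le a_i \le n_i$ for $i = 1, 2, \cdots, m$
and $j = 1, 2, \cdots, n_i$. Then for any two functions of the $m$ combinations (with
repetitions), $(\alpha)_1, (\alpha)_2, \cdots, (\alpha)_m$ out of $(\nu)_1, (\nu)_2, \cdots, (\nu)_m$, the two sets of
formulas:
$$F((\alpha)_1, (\alpha)_2, \cdots, (\alpha)_m) = \sum_{(\beta)_i\in(\nu)_i-(\alpha)_i} f((\alpha)_1 + (\beta)_1, (\alpha)_2 + (\beta)_2, \cdots, (\alpha)_m + (\beta)_m) \tag{1}$$
and
$$f((\alpha)_1, (\alpha)_2, \cdots, (\alpha)_m) = \sum_{(\beta)_i\in(\nu)_i-(\alpha)_i}\Bigl[\prod_{i=1}^{m}\mu((\beta)_i)\Bigr] F((\alpha)_1 + (\beta)_1, (\alpha)_2 + (\beta)_2, \cdots, (\alpha)_m + (\beta)_m) \tag{2}$$
are equivalent.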
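PROOF. To deduce (2) from (1)
$$\sum_{(\beta)_i\in(\nu)_i-(\alpha)_i}\Bigl[\prod_{i=1}^{m}\mu((\beta)_i)\Bigr] F((\alpha)_1 + (\beta)_1, \cdots, (\alpha)_m + (\beta)_m)$$
$$= \sum_{(\beta)_i\in(\nu)_i-(\alpha)_i}\Bigl[\prod_{i=1}^{m}\mu((\beta)_i)\Bigr]\sum_{(\gamma)_i\in(\nu)_i-(\alpha)_i-(\beta)_i} f((\alpha)_1 + (\beta)_1 + (\gamma)_1, \cdots, (\alpha)_m + (\beta)_m + (\gamma)_m)$$
$$= \sum_{(\delta)_i\in(\nu)_i-(\alpha)_i} f((\alpha)_1 + (\delta)_1, \cdots, (\alpha)_m + (\delta)_m)\sum_{(\gamma)_i\in(\delta)_i}\prod_{i=1}^{m}\mu((\delta)_i - (\gamma)_i).$$
Evidently we have
$$\sum_{(\gamma)_i\in(\delta)_i}\prod_{i=1}^{m}\mu((\delta)_i - (\gamma)_i) = \prod_{i=1}^{m}\Bigl\{\sum_{(\gamma)_i\in(\delta)_i}\mu((\delta)_i - (\gamma)_i)\Bigr\} = \prod_{i=1}^{m}\Bigl\{\sum_{(\gamma)_i\in(\delta)_i}\mu((\gamma)_i)\Bigr\} = \begin{cases}1 & \text{if } (\delta)_i = (0) \text{ for } i = 1, \cdots, m\\ 0 & \text{otherwise}\end{cases}$$
by the property of the $\mu$-function. Hence the preceding sum reduces to
$f((\alpha)_1, \cdots, (\alpha)_m)$ in accord with (2).

(1) is deduced from (2) in a similar way.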

(1) is deduced from (2) in a similar way. 

Although the general case is not without importance in the treatment of 
several sets of events,’ we shall for the sake of convenience restrict ourselves to 
the special case m = 1. 

In order to apply these formulas we must first introduce combinations with 
repetitions into the theory of arbitrary events. This can be done in various 
ways. Firstly, we may consider the number of occurrences of each event in a 
given time-interval or in a series of trials not necessarily independent. Secondly, 
we may regard each event as possessing various degrees of intensity. If the
event $E_i$ occurs $r_i$ times in a given time-interval or occurs with $r_i$ degrees of
intensity, we write it as $E_i^{r_i}$. Hereafter we shall make use of the first interpretation
and we shall assume that the maximum number of occurrences of each event
is finite:


3 Cf. Fréchet, Loc. Cit. pp. 50-52; also, K. L. Chung, "Generalization of Poincaré's
formula in the theory of probability," Annals of Math. Stat., Vol. 14 (1943). We may note
that our general theorem may be used to give another proof of the generalized Poincaré's
formula for several sets of events.


We define
$p[E_1^{r_1}\cdots E_n^{r_n}] = p[(\nu^r)]$ = The probability that $E_i$ occurs exactly $r_i$
times in the given time-interval.
$p(E_1^{r_1}\cdots E_n^{r_n}) = p((\nu^r))$ = The probability that $E_i$ occurs at least $r_i$
times in the given time-interval.
These quantities play the same rôle as the $p[(\alpha)]$'s and $p((\alpha))$'s in the ordinary
theory. Evidently the probability of every complex event in question can be
expressed as the sum of certain $p[(\nu^r)]$'s. To prove that the $p((\nu^r))$'s also form
a fundamental system of quantities we have only to express $p[(\nu^r)]$'s in terms of
the $p((\nu^r))$'s. This is given immediately by an application of the general
theorem with $m = 1$. For we have in an obvious way
$$p(E_1^{r_1}\cdots E_n^{r_n}) = \sum_{r_i \le s_i \le \lambda_i} p[E_1^{s_1}\cdots E_n^{s_n}]$$
or
$$p((\nu^r)) = \sum_{(\nu^s)\in(\nu^\lambda)-(\nu^r)} p[(\nu^r) + (\nu^s)] = \sum_{(\nu^s)\in(\nu^{\lambda-r})} p[(\nu^{r+s})]. \tag{3}$$
Hence we obtain the inversion
$$p[(\nu^r)] = \sum_{(\nu^s)\in(\nu^\lambda)-(\nu^r)} \mu((\nu^s))\,p((\nu^r) + (\nu^s)). \tag{4}$$
Let $(\alpha')$ denote a running combination without repetitions. Then since $\mu((\nu^s)) =
0$ unless $(\nu^s)$ is a $(\alpha')$,
$$p[(\nu^r)] = \sum_{(\alpha')\in(\nu^\lambda)-(\nu^r)} \mu((\alpha'))\,p((\nu^r) + (\alpha')) = \sum_{(\alpha')\in(\nu^\lambda)-(\nu^r)} (-1)^a\,p((\nu^r) + (\alpha')). \tag{4'}$$
The set of formulas (3) and (4) generalize (3.1) and (4.1).
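The inversion (4') can be tried on a small example. The sketch below builds a hypothetical joint distribution for the numbers of occurrences of two events, forms the "at least" probabilities as in (3), and recovers the "exactly" probabilities with alternating signs; the maximum counts, the random weights, and the function names are assumptions of the illustration.

```python
import random
from itertools import product

lam = (2, 3)  # hypothetical maximum numbers of occurrences of E_1 and E_2
random.seed(1)
cells = list(product(*(range(l + 1) for l in lam)))
weights = [random.random() for _ in cells]
total = sum(weights)
exact = {cell: w / total for cell, w in zip(cells, weights)}  # p[(nu^r)]

def at_least(r):
    """p((nu^r)): every event E_i occurs at least r_i times, as in (3)."""
    return sum(p for s, p in exact.items()
               if all(si >= ri for si, ri in zip(s, r)))

def exact_from_at_least(r):
    """p[(nu^r)] recovered through (4'), summing over combinations without repetitions."""
    free = [i for i in range(len(lam)) if r[i] < lam[i]]
    value = 0.0
    for bits in product((0, 1), repeat=len(free)):
        shifted = list(r)
        for i, bit in zip(free, bits):
            shifted[i] += bit
        value += (-1) ** sum(bits) * at_least(tuple(shifted))
    return value

print(max(abs(exact[c] - exact_from_at_least(c)) for c in cells))
```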
Corresponding to the $p_{[a]}((\nu))$ for the ordinary events we define for $a + b + \cdots
= n$ and $r, s, \cdots$ all distinct:
$p_{[a]r,[b]s,\cdots}(E_1^{\lambda_1}\cdots E_n^{\lambda_n})$ = The probability that among $n$ events $E_1, E_2,
\cdots, E_n$ exactly $a$ events occur $r$ times, exactly $b$ events occur $s$ times and so on.
By (4) we easily obtain


Pair, tb)2,---((»»)) 
=>) a —w((7))p((07) + (a)’ + (8)? + +++) 


S (v®) € (v§)—((a)+(8)#+++-) 
where $(\alpha)^r = (E_{\alpha_1}^r\cdots E_{\alpha_a}^r)$, $(\beta)^s = (E_{\beta_1}^s\cdots E_{\beta_b}^s)$, $\cdots$ and the first summation
is a symmetric sum which extends to all $n!/a!\,b!\cdots$ different combinations
$(\alpha_1\cdots\alpha_a)$, $(\beta_1\cdots\beta_b)$, $\cdots$ out of $(\nu) = (1, 2, \cdots, n)$.
The equality (5) is obviously a generalization of Poincaré's formula.
Similarly for the probabilities in the definition of which the word “exactly” 


(5) 







is sometimes substituted for the words “at least.’ Of course we can express 
all of them in terms of the p[(v’)]’s or of the p((v’))’s. However elegant formulas 
such as in the ordinary theory seem to be lacking. 

Finally, we may also consider conditions of existence for the $p[(\nu^r)]$'s and the
$p((\nu^r))$'s. For the former system the conditions are that they be all non-negative
and that their sum be 1. For the latter system, the conditions are given by
(4'), viz. for every $(\nu^r) \in (\nu^\lambda)$,
$$\sum_{(\alpha')\in(\nu^\lambda)-(\nu^r)} \mu((\alpha'))\,p((\nu^r) + (\alpha')) \ge 0.$$
These conditions are necessary and sufficient since (3) and (4) are equivalent.


ON THE MECHANICS OF CLASSIFICATION 


By Carl F. Kossack
University of Oregon 


1. Introduction. Wald¹ has recently determined the distribution of the
statistic U to be used in the classification of an observation, $z_i$ ($i = 1, 2, \cdots, p$),
as coming from one of two populations. He also determined the critical region 
which is most powerful for such a classification. It is the purpose of this paper 
to show how such a classification statistic under the assumption of large sampling 
can be applied in an actual problem and to present a systematic approach to the 
necessary computations. 

The data used in this demonstration are those which were obtained from the 
A.S.T.P. pre-engineering trainees assigned to the University of Oregon. The 
problem considered is that of classifying a trainee as to whether he will do unsatisfactory
or satisfactory work² in the first term mathematics course (Intermediate
Algebra). The variables used in the classification are: (1) A Mathematics
Placement Test Score. This is the score obtained by the trainee on a
fifty-minute elementary mathematics test (including elementary algebra).
The test was given to each trainee on the day that he arrived on the campus.
(2) A High School Mathematics Score. A trainee's high school mathematics
record was made into a score by giving 1 point to students who had had no high
school algebra, 2 points to students with an F in first-year, high-school algebra
and no second-year algebra, 3 points for a D, $\cdots$, 10 points for an average grade
of A in first- and second-year algebra. (3) The Army General Classification
Test Score. An individual needed a score of 115 or better in order to be assigned 
to the A.S.T.P. These data were obtained for 305 trainees along with the actual 


1 ABRAHAM WALD, “‘On a statistical problem arising in the classification of an individual 
into one of two groups,’’ Annals of Math. Stat., Vol. 15, (1944), No. 2. 

2 Unsatisfactory work was defined as a grade of F or D in the course (failure or the lowest 
passing grade). 




grade made by them in the algebra course. Trainees who had had college work 
were not included in the study. 


2. Steps in the Computation of U and the Critical Region. Let

$\pi_1$ be the population of individuals who do unsatisfactory work in their first-term
mathematics course.

$\pi_2$ be the population of individuals who do satisfactory work.

$N_1$ and $N_2$ = respectively the number of observed individuals in $\pi_1$ and $\pi_2$.

$x_{1\alpha}$ and $y_{1\alpha}$ = respectively the Mathematics Placement Test Score for the
$\alpha$th individual observed in $\pi_1$ and $\pi_2$.

$x_{2\alpha}$ and $y_{2\alpha}$ = respectively the High School Mathematics Score.

$x_{3\alpha}$ and $y_{3\alpha}$ = respectively the Army General Classification Test Score.

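Step 1. Computation of Summations

$N_1 = 96$                                                        $N_2 = 209$
$\sum x_{1\alpha} = 3570$                                         $\sum y_{1\alpha} = 11450$
$\sum x_{2\alpha} = 547$                                          $\sum y_{2\alpha} = 1567$
$\sum x_{3\alpha} = 11745$                                        $\sum y_{3\alpha} = 26684$
$\sum x_{1\alpha}^2 = 145476$                                     $\sum y_{1\alpha}^2 = 672452$
$\sum x_{2\alpha}^2 = 3509$                                       $\sum y_{2\alpha}^2 = 12577$
$\sum x_{3\alpha}^2 = 1439559$                                    $\sum y_{3\alpha}^2 = 3421996$
$\sum x_{1\alpha}x_{2\alpha} = 21012$                             $\sum y_{1\alpha}y_{2\alpha} = 88774$
$\sum x_{1\alpha}x_{3\alpha} = 436964$                            $\sum y_{1\alpha}y_{3\alpha} = 1469302$
$\sum x_{2\alpha}x_{3\alpha} = 66731$                             $\sum y_{2\alpha}y_{3\alpha} = 200150$
$\sum (x_{1\alpha} - \bar x_1)^2 = 12716.625$                     $\sum (y_{1\alpha} - \bar y_1)^2 = 45167.311$
$\sum (x_{2\alpha} - \bar x_2)^2 = 392.240$                       $\sum (y_{2\alpha} - \bar y_2)^2 = 828.249$
$\sum (x_{3\alpha} - \bar x_3)^2 = 2631.656$                      $\sum (y_{3\alpha} - \bar y_3)^2 = 15125.876$
$\sum (x_{1\alpha} - \bar x_1)(x_{2\alpha} - \bar x_2) = 670.438$     $\sum (y_{1\alpha} - \bar y_1)(y_{2\alpha} - \bar y_2) = 2926.392$
$\sum (x_{1\alpha} - \bar x_1)(x_{3\alpha} - \bar x_3) = 196.812$     $\sum (y_{1\alpha} - \bar y_1)(y_{3\alpha} - \bar y_3) = 7427.359$
$\sum (x_{2\alpha} - \bar x_2)(x_{3\alpha} - \bar x_3) = -191.031$    $\sum (y_{2\alpha} - \bar y_2)(y_{3\alpha} - \bar y_3) = 83.837$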


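Step 2. Computation of Statistics.

$\bar x_1 = 37.188$          $\bar y_1 = 54.785$
$\bar x_2 = 5.6979$          $\bar y_2 = 7.4976$
$\bar x_3 = 122.3438$        $\bar y_3 = 127.6746$

$$s_{ij} = \frac{\sum_\alpha (x_{i\alpha} - \bar x_i)(x_{j\alpha} - \bar x_j) + \sum_\alpha (y_{i\alpha} - \bar y_i)(y_{j\alpha} - \bar y_j)}{N_1 + N_2 - 2}$$

$s_{11} = 191.04$            $s_{12} = 11.871$
$s_{22} = 4.0280$            $s_{13} = 25.162$
$s_{33} = 58.606$            $s_{23} = -.35378$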


Step 3. Computation of Inverse Matrix $\|s^{ij}\|$

$$|s_{ij}| = \begin{vmatrix}191.04 & 11.871 & 25.162\\ 11.871 & 4.0280 & -.35378\\ 25.162 & -.35378 & 58.606\end{vmatrix} = 34053$$

$s^{11} = .0069286$          $s^{12} = -.020692$
$s^{22} = .31019$            $s^{13} = -.0030996$
$s^{33} = .018459$           $s^{23} = .010756$

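Step 4. Computation of the Classification Equation.
$$U = [s^{11}(\bar y_1 - \bar x_1) + s^{12}(\bar y_2 - \bar x_2) + s^{13}(\bar y_3 - \bar x_3)]\,z_1 + [s^{21}(\bar y_1 - \bar x_1) + s^{22}(\bar y_2 - \bar x_2) + s^{23}(\bar y_3 - \bar x_3)]\,z_2 + [s^{31}(\bar y_1 - \bar x_1) + s^{32}(\bar y_2 - \bar x_2) + s^{33}(\bar y_3 - \bar x_3)]\,z_3$$
where $z_i$ plays the same role for individuals to be classified as $x_{i\alpha}$ and $y_{i\alpha}$ do for
observed individuals.

$$U = .068160\,z_1 + .25147\,z_2 + .063215\,z_3$$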
Step 5. Computation of the Critical Region (assuming $W_1 = W_2$)
$\bar\alpha_1 = .068160\,\bar x_1 + .25147\,\bar x_2 + .063215\,\bar x_3 = 11.702$
$\bar\alpha_2 = .068160\,\bar y_1 + .25147\,\bar y_2 + .063215\,\bar y_3 = 13.691$
$\tfrac12(\bar\alpha_1 + \bar\alpha_2) = 12.696$
Therefore,

For $U < 12.696$ classify the individual as coming from $\pi_1$ population.
For $U > 12.696$ classify the individual as coming from $\pi_2$ population.
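Steps 1 to 5 can be retraced from the published summations alone. The following sketch does so in plain Python; the variable and function names are hypothetical, and because it works from the unrounded sums its results agree with the printed values only to about the last printed figure.

```python
# Summations from Step 1 (x: unsatisfactory group, y: satisfactory group).
N1, N2 = 96, 209
sum_x = [3570, 547, 11745]
sum_y = [11450, 1567, 26684]
# Sums of squares and cross products arranged as symmetric 3x3 tables.
prod_x = [[145476, 21012, 436964],
          [21012, 3509, 66731],
          [436964, 66731, 1439559]]
prod_y = [[672452, 88774, 1469302],
          [88774, 12577, 200150],
          [1469302, 200150, 3421996]]

def centered(sums, prods, n):
    """Sums of squares and products taken about the group means."""
    return [[prods[i][j] - sums[i] * sums[j] / n for j in range(3)]
            for i in range(3)]

def inverse3(m):
    """Inverse and determinant of a 3x3 matrix by the adjugate formula."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, k = m[2]
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    adj = [[e * k - f * h, c * h - b * k, b * f - c * e],
           [f * g - d * k, a * k - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj], det

xbar = [s / N1 for s in sum_x]                      # Step 2: group means
ybar = [s / N2 for s in sum_y]
cx = centered(sum_x, prod_x, N1)
cy = centered(sum_y, prod_y, N2)
pooled = [[(cx[i][j] + cy[i][j]) / (N1 + N2 - 2) for j in range(3)]
          for i in range(3)]                        # Step 2: s_ij
inv, det = inverse3(pooled)                         # Step 3: s^ij and |s_ij|
diff = [ybar[j] - xbar[j] for j in range(3)]
coef = [sum(inv[i][j] * diff[j] for j in range(3)) for i in range(3)]  # Step 4
mean_u_1 = sum(c * v for c, v in zip(coef, xbar))   # Step 5
mean_u_2 = sum(c * v for c, v in zip(coef, ybar))
cutoff = 0.5 * (mean_u_1 + mean_u_2)

def classify(z):
    """Return 1 or 2 for the population to which scores z are assigned."""
    u = sum(c * v for c, v in zip(coef, z))
    return 1 if u < cutoff else 2

print([round(c, 5) for c in coef], round(cutoff, 3))
```

A trainee with hypothetical scores `z = (40, 6, 120)` would, for instance, be classified by `classify(z)`.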
Step 6. Computation of the Efficiency of Classification. 
P= s\H%-— (HA —-A+ 9° HA —- We —&) + 9°GH — AG — 
+ 8G. — %)(G1 — Hh) + 8° (Ge — %) (G2 — He) + (He — H)(Gs — 2 
+ s"(Gs — 3)(G1 — Hs) + 8° (Gs — Fs)(Ge — He) + 8° (Gs — 4) (Gs — 2 
= 1.5764. 
de — Gi _ 
— .792 


Pj=1-P,= == [ e'* = 2062 
/ 2x .792 


where $P_1$ is the probability of making an error of Type I, that is, of classifying
an individual as one who will do satisfactory work when he actually does unsatisfactory
work; and $1 - P_2$ is the probability of making an error of Type II,
that is, of classifying a student as one who will do unsatisfactory work when he
actually does satisfactory work.

3. Conclusions. In using the above classification equation to classify the 
305 trainees used in this study, 21 errors of Type I were made or 22.9 percent, 
while 50 errors of Type II were made or 23.9 percent. These percentages seem 
reasonably close to the expected 20.6 percent. 





NOTE ON AN IDENTITY IN THE INCOMPLETE BETA FUNCTION 


By T. A. Bancroft


Iowa State College 


Since the incomplete beta function has proved of some importance in statistics, 
it would appear that any additional information concerning its properties might 
at some time prove useful. In a paper by the author, [1], two identities in the 
incomplete beta function were incidentally obtained. They are as follows: 


$$(p + q)\,I_x(p, q) = p\,I_x(p + 1, q) + q\,I_x(p, q + 1) \tag{1}$$
and
$$(p + q + 1)^{(2)} I_x(p, q) = (p + 1)^{(2)} I_x(p + 2, q) + 2pq\,I_x(p + 1, q + 1) + (q + 1)^{(2)} I_x(p, q + 2), \tag{2}$$
where the incomplete beta function $I_x(p, q) = \dfrac{B_x(p, q)}{B(p, q)}$, etc., and $(p + 1)^{(2)}$,
etc. refer to the standard factorial notation.
Written in the above form these two identities suggest a possible general 
identity to which they belong as special cases. The third special case suggested is: 


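$$(p + q + 2)^{(3)} I_x(p, q) = (p + 2)^{(3)} I_x(p + 3, q) + 3(p + 1)^{(2)} q\,I_x(p + 2, q + 1) + 3p(q + 1)^{(2)} I_x(p + 1, q + 2) + (q + 2)^{(3)} I_x(p, q + 3). \tag{3}$$
The general formula suggested is
$$(p + q + n - 1)^{(n)} I_x(p, q) = \sum_{r=0}^{n}\binom{n}{r}(p + n - r - 1)^{(n-r)}(q + r - 1)^{(r)}\,I_x(p + n - r, q + r). \tag{4}$$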
To prove the general formula we write (4) as
$$(p + q + n - 1)^{(n)}\,\frac{B_x(p, q)}{B(p, q)} = \sum_{r=0}^{n}\binom{n}{r}(p + n - r - 1)^{(n-r)}(q + r - 1)^{(r)}\,\frac{B_x(p + n - r, q + r)}{B(p + n - r, q + r)}. \tag{5}$$





By expanding and simplifying it is easy to show that
$$\frac{(p + n - r - 1)^{(n-r)}(q + r - 1)^{(r)}}{B(p + n - r, q + r)} = \frac{(p + q + n - 1)^{(n)}}{B(p, q)}. \tag{6}$$
Using (6) the right hand side of (5) reduces to
$$\frac{(p + q + n - 1)^{(n)}}{B(p, q)}\sum_{r=0}^{n}\binom{n}{r} B_x(p + n - r, q + r). \tag{7}$$
The summed function in (7) reduces to
$$\int_0^x t^{p-1}(1 - t)^{q-1}\,[t + (1 - t)]^n\,dt = B_x(p, q), \tag{8}$$
which proves the identity.
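The general identity (4) lends itself to a numerical spot check. The sketch below evaluates $I_x(p, q)$ by Simpson's rule, which is adequate here only because the illustration assumes $p \ge 1$ and $q \ge 1$ so that the integrand is bounded; the parameter values and function names are hypothetical.

```python
import math

def falling(a, k):
    """Factorial power a^(k) = a (a-1) ... (a-k+1), equal to 1 when k = 0."""
    out = 1.0
    for j in range(k):
        out *= a - j
    return out

def incomplete_beta_ratio(x, p, q, steps=2000):
    """I_x(p, q) by Simpson's rule; assumes p >= 1, q >= 1 and an even step count."""
    h = x / steps
    g = lambda t: t ** (p - 1) * (1.0 - t) ** (q - 1)
    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    bx = total * h / 3.0
    return bx * math.gamma(p + q) / (math.gamma(p) * math.gamma(q))

def left_side(x, p, q, n):
    return falling(p + q + n - 1, n) * incomplete_beta_ratio(x, p, q)

def right_side(x, p, q, n):
    return sum(math.comb(n, r)
               * falling(p + n - r - 1, n - r)
               * falling(q + r - 1, r)
               * incomplete_beta_ratio(x, p + n - r, q + r)
               for r in range(n + 1))

for n in (1, 2, 3):
    print(n, left_side(0.4, 3.0, 2.5, n), right_side(0.4, 3.0, 2.5, n))
```

The cases $n = 1, 2, 3$ correspond to identities (1), (2) and (3).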
Although the general identity is quite simple to prove, it does not seem to
have appeared in the literature.

REFERENCE

[1] Bancroft, T. A. "On biases in estimation due to the use of preliminary tests of significance,"
Annals of Math. Stat., Vol. 15 (1944), No. 2.





NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest 


Personal Items 


Archie Blake is now employed as a ballistician with the Ballistic Research 
Laboratory at Aberdeen Proving Ground. 

Robert V. Bonnar is now employed as Associate Technologist at the Mare 
Island Navy Yard. 

Professor W. G. Cochran has returned to his regular duties at Iowa State 
College. 

Mrs. Bianca Cody (Bianca Rivoli) is now Statistician for the James O. Peck 
Research Company, 12 East 41st Street, New York City. 

Associate Professor William Feller of Brown University has been appointed 
Professor of Mathematics at Cornell University. 

Professor John Kenney of the University of Wisconsin is now located at the 
Milwaukee branch of the University. 

Myra Levine is now Assistant Mathematical Statistician with the Statistical 
Research Group at Columbia University. 

Mrs. Harold Michaelis (Ruth E. Jolliffe) is 5th Naval District Statistician 
at the Naval Operating base in Norfolk, Va. 

Emma Spaney is Statistician for the Committee on Measurement of the 
National League of Nursing Education. 

Professor J. A. Shohat of the University of Pennsylvania died October 8, 1944. 

Mr. Redford T. Webster of the Western Electric Company died July 31, 1944. 


New Members 
The following persons have been elected to membership in the Institute : 

Boddie, John B., Jr. Chief, Program Section, Budget Division, Washington, D.C. 2628 
Tunlaw Road, N.W. 

Bruner, Nancy M.A. (Iowa) Statistician, Western Auto Supply Co., Kansas City, Mo. 
7511 Main St. 

Christopher, Edward E. B.S. (Mass. Inst.Tech.) Statistician, Signal Corps. 5704 North 
26th St., Arlington, Va. 

Cowden, Dudley J. Ph.D. (Columbia) Prof. of Economics, Univ. of North Carolina. 
Box 515, Chapel Hill, North Carolina. 

Cynamon, Manuel M.S. (City Coll., N. Y.) Personnel Tech., Personnel Res. Sec., Adj. 
General’s Office, War Dept. 10 Ave. P, Brooklyn 4, N. Y. 

Evensen, Edward J. On military leave from Metropolitan Life Ins. Co. (Actuarial Sec.) 
Sv. Co., 1st Sp. Sv. Force. 

Green, Earl L. Ph.D. (Brown) 1st Lieut., A.C., Chief, Dept. of Statistics. AAF School
of Aviation Medicine, Randolph Field, Texas.

Groves, William Brewster B.S. (Antioch) Economist, Off. of Price Administration. 
§20 Decatur St., N.W., Washington, D.C. 




Hornseth, Richard Allen M.A. (Wisconsin) Res. Assistant in Sociology, Univ. of Wiscon- 
sin. 207 N. Randall, Madison 5, Wis. 


Kinsler, David M. M.A. (Chicago) Chief, Analytical Section, Arms & Ammunition Divi- 
sion, Aberdeen Proving Ground, Maryland. 


Kopp, Paul J. M.A. (Duke) Major, Chemical Warfare Service, U.S. A. 1805 North 
Adams St., Arlington, Va. 


Massey, Frank Jones, Jr. M.A. (California) Associate, Dept. of Math., Univ. of Cali- 
fornia, Berkeley, Calif. 1364 Union St., San Francisco 9, Calif. 

Orcutt, Guy H. Ph.D. (Michigan) Instr. Economics Dept., Mass. Inst. of Tech., Cam- 
bridge, Mass. 


Rakesky, Sophie M.S. (Michigan) Statistician, W. K. Kellogg Foundation, Battle Creek, 
Mich. 


Roberts, Jean M.S. (Minnesota) Statistician, Child Welfare Res. Analyst. 929 Good- 
rich Ave., St. Paul &, Minn. 


Schietroma, William B.S.S. (Coll. of City of N. Y.) Research Assistant. 316 East 116th 
St., New York, N.Y. 


Schlorek, Mary A. A.B. (Adelphi) Research Statistician, National Broadcasting Co., 
30 Rockefeller Plaza, New York, N. Y. 


deSousa, AlvaroPedro B.E. (Liverpool) Vice-Governor, Banco de Portugal. Monserrate, 
Rua Infante de Sagres, Estoril, Portugal. 


Steele, Floyd George M.S. (Calif. Inst.of Tech.) Stat. Analyst, Douglas Aircraft. 18168 
Roosevelt Highway, Pacific Palisades, Calif. 
Thom, Herbert C.S. 6130 18th Rd., N., Arlington, Va. 


Report of the Fifth Pittsburgh Chapter Meeting 


The fifth meeting of the Pittsburgh Chapter of the Institute of Mathematical 
Statistics was held at Engineering Hall, Carnegie Institute of Technology on 
Saturday, November 25, 1944. The meeting was held as a joint session with the 
Pittsburgh Quality Control Society. Thirty-one persons attended the meeting, 
including the following six members of the Institute: 


George Eldredge, H. J. Hand, C. R. Mummery, E. G. Olds, E. M. Schrock, J. V. Sturte- 
vant. 


The following papers were presented, with Mr. J. V. Sturtevant, of the Car- 
negie Illinois Steel Corporation, acting as chairman: 


1. Modified Application of Control Chart to the Use of Gauges on Machine Tool Work. 
Dr. E. G. Olds, War Production Board, Washington, D. C. 

2. Application of Control Charts to Infrequent Inspection of Machine Operations. 
W. D. Angst, Thompson Aircraft Products Company, Cleveland, Ohio. 


3. Application of Control Chart Techniques to Checking Reproducibility of Chemical 
Analysis. 


H. A. Stobbs, Wheeling Steel Corporation, Steubenville, Ohio. 


4. Statistical Principles of Experimental Design as Applied to Tests Conducted in Manu- 
facturing Operations. 


Dr. B. Epstein, Westinghouse Electric & Manufacturing Co., East Pittsburgh, Pa. 


H. J. Hand,
Secretary-Treasurer, Pittsburgh Chapter 







Educational Meetings of the Pittsburgh Chapter 


The first of a series of educational meetings on methods of statistical computa- 
tions given by the Pittsburgh Chapter was held on Saturday afternoon, January 
20, 1945. Thirty-three persons attended the meeting, including the following 
three members of the Institute: 


Thomas A. Elkins, H. J. Hand, J. V. Sturtevant. 
The following program was presented: 


1. Potential Field for Industrial Applications of Statistical Method.
H. J. Hand, National Tube Company, Pittsburgh, Pa.

2. Computations for Analysis of Variance and Experimental Design.
Ben Epstein, Westinghouse Electric & Manufacturing Company, East Pittsburgh, Pa.


It is planned to hold these meetings bi-weekly, on Saturday afternoons for an 
indefinite period in the future. Topics to be considered in the series will include: 


1. Analysis of variance and covariance.
2. Design of experiments.
3. Tests of significance.
4. Probability and probability distributions.
5. Correlation and regression analysis, including the orthogonal coordinate method.
6. Tests of increased severity.
7. Sampling theory, including stratification.
8. Acceptance-rejection mathematics, Dodge sampling inspection tables.
9. Shewhart control chart techniques.
10. Analysis of runs.
11. Cycle analysis.
12. Factor analysis.


H. J. Hand,
Secretary-Treasurer, Pittsburgh Chapter 





ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE 


Continuing the established tradition, the annual summer meeting was held at 
Wellesley, Massachusetts, August 12-13, 1944 in conjunction with the Summer 
Meetings of the American Mathematical Society and the Mathematical Associa- 
tion of America. A regional meeting was held in Washington, May 6-7, in 
conjunction with the meeting of the Washington Chapter of the American 
Statistical Association. The programs were arranged by the Program Committee:
W. Feller, Chairman, W. G. Madow, and A. Wald.

Even though, under present war conditions, research in the field of probability 
and statistics is very much curtailed, enough papers in mathematical statistics 
of satisfactory quality have been proposed for publication in the Annals in 1944 
to keep the total volume of material at approximately five hundred pages or the 
level of the last few years. However, the outlook for a sufficient number of 
satisfactory papers to maintain the usual volume of publication during 1945 does 
not look quite so favorable. 

Looking into the future, the Institute must continue to furnish, through the 
Annals, a medium for the publication of all important results of original research
in the field of mathematical statistics as they become available. To do otherwise 
would be suicide. At the same time we must take account of the growing need 
for comprehensive surveys of statistical theory on the part of other scientists, 
including not only social scientists but also physicists, chemists, biologists, and 
research engineers, whose interest in the contributions of mathematical statistics 
has been greatly stimulated during the war. Only the mathematical statistician
of broad competence can provide adequate critical surveys of this character. 
Perhaps some of this need can be met through survey articles published in the 
Annals, although it is not an easy matter to get capable men to do such work. 
Perhaps the time is not far off when the Institute must stimulate the preparation 
of such material by instituting an annual series of Colloquium Lectures patterned 
somewhat after those of the Mathematical Society, which could be published 
separately. 

This is but one of many problems that the Institute faces in its post-war 
development. Not only must it assume the responsibility of stimulating and 
encouraging research and of publishing the results; it must also consider the 
problem of training the research statistician of tomorrow as well as those who 
are to apply mathematical statistics in the many fields of science. It also must 
assume some responsibility for keeping in contact with other scientists in order 
that the mathematical statistician may become acquainted with the unsolved 
statistical problems of the scientist. There are also many problems of a pro- 
fessional character that face the mathematical statistician in the future if he is 
to succeed in developing the profession of mathematical statistics to the level 
attained by some of the older scientific professions. 

With the realization of the need for a concerted attack on some of these 



problems, the Board of Directors at its meeting in May set up two committees, 
one on Training and Placement of Statisticians under Harold Hotelling and the 
other on Post-War Development of the Institute under W. G. Cochran. In- 
terim reports received by the Board from both committees indicate that consid- 
erable progress has been made to date. They also indicate, however, that much 
more work remains to be done. 

At the same meeting of the Board, a Budget and Finance Committee was set 
up, consisting of P. S. Dwyer, Chairman, C. H. Fischer, A. C. Olshen, and C. F. 
Roos, to prepare a report on the policy that should be followed by the Institute 
in respect to such items as investment of funds, advertising, preparation of an 
annual statement, and the like. Some of the work of this committee has already 
borne fruit, as, for example, in providing the actuarial basis for life membership 
adopted at the Wellesley meeting and in establishing certain principles to be 
used in conducting the business of the Institute. 

A report of the Committee on Membership, W. G. Cochran, Chairman, P. S. 
Dwyer, and T. Koopmans, appears elsewhere in this issue of the Annals. Upon 
recommendation of this committee, the Board of Directors elected nine new 
fellows: Walter Bartky, C. I. Bliss, Gertrude M. Cox, P. A. Horst, M. G. Kendall, 
H. B. Mann, E. S. Pearson, Henry Scheffé, and W. A. Wallis. 

The nominating committee for the year consisted of John Curtiss, Chairman, 
E. G. Olds, and F. F. Stephan. G. W. Snedecor served the Institute again as its 
representative on the Council of the A.A.A.S. 

The annual election of the Institute just concluded by mail ballot resulted 
in the election of the following officers for 1945: W. E. Deming, President; W. G. 
Cochran and J. L. Doob, Vice-Presidents. 


WALTER A. SHEWHART 
President, 1944 
February 10, 1945 





ANNUAL REPORT OF THE SECRETARY-TREASURER 
OF THE INSTITUTE 


Accounts of the 1944 meetings of the Institute—the Wellesley meeting, the 
Washington regional meeting, and the Pittsburgh chapter meetings—have ap- 
peared in appropriate issues of the Annals. 

At the Wellesley meeting a number of amendments to the Constitution and 
By-Laws were passed. These were published in the September, 1944, issue of 
the Annals. (The amended Constitution and By-Laws appear elsewhere in this 
issue.) 

Due to a large extent to the cooperation of the membership in sending in nom- 
inations, the Institute enjoyed a large increase in membership during the year. 
There were some resignations and it was necessary to suspend fifteen persons at 
the end of 1944 because of failure to pay dues. It is apparent that, in some of 
these cases at least, our mail is not being received. Undoubtedly some of these 
memberships will be restored when contact is again established. As of January 
1, 1945, there were 606 members, a net gain of approximately one hundred 
members. 

During the year the Institute received gifts from Professor Harry Carver in 
the form of exchanges for early issues of the Annals, reprints of early articles, etc. 

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in looking after the back issues of the Annals which are 
stored at Iowa City. 

The following financial statement covers the period from December 22, 1943 
to December 31, 1944 (the books and records of the Treasurer have been audited 
by Professor Thomas A. Bickerstaff and were found to be in agreement with the 
statement as submitted): 


FINANCIAL STATEMENT 
December 22, 1943, to December 31, 1944 


RECEIPTS 

BALANCE ON HAND, DECEMBER 22, 1943 ............................ $3,715.05 
DUES 
    [label illegible] ........................... $2,995.31 
    1945 and 1946 ...............................  1,127.00 
    [further item(s) illegible] 
                                                                  4,452.31 
SUBSCRIPTIONS 
    1944 and before ............................. $1,301.94 
    1945 and 1946 ............................... [amount illegible] 
[further receipt items illegible] 

Total Receipts ............................................... $11,744.41 


EXPENDITURE 

ANNALS—CURRENT 
    [label illegible] ...........................   $273.77 
    [label illegible] ...........................  3,448.51 
ANNALS—BACK NUMBERS 
    Purchase from H. C. Carver ..................   $149.40 
    [label illegible] ...........................     96.26 
OFFICE OF SECRETARY-TREASURER 
    Printing, mimeographing, programs, etc. (including stamped 
        [illegible]) ............................   $377.00 
    [label illegible] ...........................     68.02 
    [label illegible] ...........................    455.94 
    Moving office from Pittsburgh ...............     55.79 
                                                                  956.[illegible] 
[label illegible] ................................................ 29.0[illegible] 
BALANCE ON HAND, DECEMBER 31, 1944 ............................. 6,790.65 

                                                               $11,744.41 





No unpaid bills were in the hands of the Treasurer as of December 31, 1944, 
and aside from an additional $100.00 which the Board has designated for Annals 
expense for 1944, there were no large bills outstanding. 

Accounts receivable as of December 31, 1944, amounted to $303.73. Many 
of these accounts are current accounts while some of the older ones are accounts 
with firms in India, which probably will be collected eventually. 

The American Library Association continued with its purchase of thirty sets 
of Volume XV of the Annals (for post war distribution) and the Universal Trad- 
ing Corporation (representing the Chinese Government) purchased twenty 
sets of Volumes 11-17 inclusive. These orders contributed in no small way to 
the total 1944 income of $8,029.36. 

The 1944 balance of $6,790.65 (consisting of a bank balance of $3,790.65 and 
$3,000.00 in government bonds) is $3,075.60 higher than it was on December 21, 
1943. This increase is due in part to 1944 business and in part to the fact that 
unusually large payments toward future business, such as the $330.00 in life 
payments and the $1,127.00 in 1945 and 1946 dues, have been made. 
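
The figures quoted in this report can be checked against one another. The short Python sketch below is only an illustrative reconciliation and is not part of the Treasurer's statement. It assumes that the printed total of $11,744.41 includes the opening balance, so that income is the total less the opening balance and expenditure is the total less the closing balance; the variable names are hypothetical.

```python
from decimal import Decimal as D

# Figures as printed in the report.
opening_balance = D("3715.05")   # balance on hand, December 22, 1943
total_receipts = D("11744.41")   # printed total (assumed to include the opening balance)
closing_balance = D("6790.65")   # balance on hand, December 31, 1944
bank_balance = D("3790.65")
government_bonds = D("3000.00")

# Derived quantities (assumptions about how the statement is laid out).
income_1944 = total_receipts - opening_balance
balance_increase = closing_balance - opening_balance
total_expenditure = total_receipts - closing_balance

# The four legible items under "Office of Secretary-Treasurer".
office_items = [D("377.00"), D("68.02"), D("455.94"), D("55.79")]
office_total = sum(office_items)

print("1944 income:        ", income_1944)
print("Increase in balance:", balance_increase)
print("Total expenditure:  ", total_expenditure)
print("Office subtotal:    ", office_total)
print("Bank plus bonds:    ", bank_balance + government_bonds)
```

Under these assumptions the income and the increase in balance agree with the $8,029.36 and $3,075.60 quoted in the text, and the office subtotal is consistent with the partly legible "956." in the statement.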

To summarize the situation briefly, the Institute’s 1944 activity has resulted 
in a gain of approximately $1,500.00 and we are about this much in advance 
of our usual position with reference to the payments of following years. 

PAUL S. DWYER 
Secretary-Treasurer. 
December 31, 1944 


REPORT OF THE MEMBERSHIP COMMITTEE OF THE INSTITUTE 


Since the duties of this Committee are not defined in detail in the Constitution, 
the Board of Directors asked the Committee to prepare a statement describing 
the appropriate composition and function of the Committee on Membership. 
This work resulted in the preparation of amendments to the Constitution and 
By-laws. These amendments were passed at the business meeting at Wellesley 
College on August 13, 1944, and are printed in full in the September, 1944, issue 
of the Annals (p. 340). 

In brief, the duties of the Committee are specified as follows in these amend- 
ments: 

(a) The Committee holds the power of election to the grades of Member and 
Junior Member and makes recommendations to the Board of Directors with 
reference to placing members in the other grades of membership. 

(b) It is the duty of the Committee to prepare and make available through 
the Secretary-Treasurer an announcement of the qualifications necessary for 
the different grades of membership and to review these qualifications periodically. 

(c) The Committee considers plans for increasing the number of applicants 
for membership. 

As permitted by the amendments referred to above, the power of election to 
the grades of Member and Junior Member was delegated by the Committee in 
August, 1944, to the Secretary-Treasurer, subject to certain reservations. The 
statement of qualifications for the different grades of membership as mentioned 
in (b) above is published below. At the August 13 meeting of the Board of 
Directors it was decided that no elections should be made at present to the grades 
of Honorary Member and Sustaining Member. 

On the recommendation of the Membership Committee the following members 
were elected as Fellows by the Board of Directors: W. Bartky, C. I. Bliss, G. M. 
Cox, P. A. Horst, M. G. Kendall, H. B. Mann, E. S. Pearson, H. Scheffé, W. A. 
Wallis. 


Statement of Qualifications for the Different Grades of 
Membership in the Institute of 
Mathematical Statistics 


Member. The candidate shall either (a) be actively engaged in or show a 
serious interest in mathematical statistics, or (b) be interested in some applied 
field of statistics, with a desire to keep himself informed regarding recent develop- 
ments in mathematical theory and techniques. 

Junior Member. 

1. Any undergraduate student of a collegiate institution is eligible for election 
as a Junior Member of the Institute of Mathematical Statistics provided that he 
or she is sponsored by a member of the Institute. 

2. The annual dues ($2.50) must be submitted with the application. 



3. Annual membership shall coincide with the calendar year and the Junior 
Member shall receive a complete volume of the Annals of Mathematical Statistics 
for the year in which he or she is elected. 

4. Junior Membership shall be limited to a term of two years, but a Junior 
Member may apply for transfer to ordinary membership at the beginning of his 
second year. 


Fellow. 

1. The candidate shall have evidenced continuing activity in research in 
mathematical statistics by publication beyond his doctor’s dissertation of in- 
dependent work of merit. Normally two or three worthwhile papers beyond the 
dissertation will be required to establish this fact. 

2. The first qualification may be partly or wholly waived in the case of (a) 
a candidate of well-established leadership among mathematical statisticians whose 
contributions to the development of the field of mathematical statistics other 
than sufficient published original research shall be judged of equal value or (b) 
a candidate of well-established leadership in the applications of mathematical 
statistics, whose work has contributed greatly to the utility of and the apprecia- 
tion for mathematical statistics. 

Honorary Member. A person of exceptional ability and acknowledged leadership 
in the field of mathematical statistics may be elected to the grade of Honorary 
Member by the Board of Directors, upon the recommendation of the 
Committee on Membership. 

Sustaining Member. The Board of Directors shall have the power to elect to 
Sustaining Membership any individual, group or corporation that is interested 
in furthering the purposes for which the Institute was formed. 

W. G. COCHRAN (Chairman) 
W. E. DEMING 
P. S. DWYER 
T. KOOPMANS 
February 10, 1945 





PROGRESS REPORT OF THE COMMITTEE ON POST-WAR 
DEVELOPMENT OF THE INSTITUTE 


In considering the post-war development of the Institute of Mathematical 
Statistics, the Committee has recognized two general problems: 

A. The problem of what additional activities the Institute should undertake 
in order to provide further stimulus to the development of the field of 
mathematical statistics. 

B. The problem of determining how the Institute can cooperate more effec- 
tively with the users of statistical techniques. 

Because of rapidly increasing interest in the application of statistical methods 
in many different fields, the Committee has directed most of its attention thus 
far to Problem B; the present progress report is concerned with the work of the 
Committee on this problem. The Committee hopes to submit a report on 
Problem A at the end of 1945. 

With respect to Problem B, it is the opinion of the Committee that a central 
organization for the statistical societies should be of common interest. Accord- 
ingly, a plan was worked out and submitted to the Board of Directors of the 
Institute at the Wellesley meeting of the Institute. This proposal and its 
present status are discussed below. 

We believe that there is much to be gained from an organization that would 
form a link between the various statistical societies, and would have the following 
principal aims: 

(1) To represent the members of the societies in all matters of common interest. 

(2) To promote cooperation between statisticians working in the different 
fields of application, and between mathematical statistics, applied statis- 
tics, scientific research and the industries. 

(3) To develop amongst the public an appreciation of the value of the statisti- 
cal method in scientific inquiry. 

It is our opinion that an organization similar to that of the Institute of Physics 
would be suitable. The statistical societies, while retaining their present autonomies, 
would become founding members of a corporation whose governing 
board would contain representatives from each society. In pursuance of its aims 
as outlined above, the new organization might: 

(a) Take the lead in formulating policies on questions which concern all 
statisticians. 

(b) Publish a journal of general interest to statisticians and undertake the 
routine work in connection with the publication of the journals of the 
individual societies, the societies retaining in full their present responsi- 
bility for the contents of their journals. 

(c) Arrange joint meetings between different statistical societies and between 
statistical and other scientific societies. 


(d) Assist new groups in organizing for their benefit, either under the auspices 
of one of the present societies or in a new society, which might at first be 
given associate membership and later full membership of the central 
organization. 

(e) Take steps to bring news about the use of statistics in scientific research 
to the attention of the public and more particularly of leaders in industry, 
in federal, state and local agencies and in education. 

(f) Investigate the demands for various types and degrees of statistical 
training, outline courses of training in statistics suitable for meeting these 
demands and make strenuous efforts to have the recommended courses 
of training put into effect, in order that statisticians can be of fullest 
service in the nation’s work. In this connection an information and 
placement bureau may be an appropriate auxiliary. 

(g) Institute an abstracting service in statistical methodology. This might 
take the form of a periodical publication of abstracts of papers with respect 
to their methodological content rather than their subject matter. The 
coverage would include journals of business, marketing, engineering, 
medicine and agriculture as well as purely statistical publications. 

The financial needs of the new organization, which would maintain a paid 
full-time staff, may be met initially by contributions from the present societies. 
In view of the extra services which would be rendered to statisticians, some 
increase in the subscription rates of the present societies appears reasonable. A 
member who belongs to more than one of the present societies would pay the 
extra amount only once. Supplementary income might be derived from ad- 
vertising in the journal of the central organization and from the establishment 
of sustaining or corporate memberships in the central organization. 

At the time of the Wellesley meeting of the Board, there had been only in- 
formal contacts between members of this Committee and members of other 
statistical societies. We considered it our first task to obtain some consensus of 
opinion from the standpoint of the Institute of Mathematical Statistics. Fol- 
lowing general approval by the Board of Directors of the Institute, members of 
the Committee discussed the proposal for a central organization with representa- 
tives of several other statistical societies. The American Statistical Association 
has a Committee to consider the future structure of the Association and this 
Committee brought the Institute proposal before the Board of Directors of the 
Association for action. As the oldest of the statistical societies, the American 
Statistical Association then invited participation in an intersociety committee 
by the Institute and nine other societies or sections, directly or indirectly con- 
cerned with statistical method. This committee is to explore the possibilities of 
coordinating the activities of the several statistical societies and report its 
recommendations back to each organization. The representatives have now 
been named and the first meeting was held on February 10, 1945, in New York. 
At this meeting the Institute was represented by W. G. Cochran and Lt. John 
H. Curtiss. 










With regard to the problem of what additional activities the Institute should 
undertake in order to furnish additional stimulation to the development of the 
field of mathematical statistics, the Committee has discussed several ideas which 
appear promising. It is hoped to present a complete report on this phase of the 
Committee’s work at the end of this year. 

C. I. BLISS 
W. G. COCHRAN (Chairman) 
W. E. DEMING 
P. S. OLMSTEAD 
S. S. WILKS 
February 12, 1945 

















CONSTITUTION 
OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


ARTICLE I 


NAME AND PURPOSE 


1. This organization shall be known as the Institute of Mathematical Statistics. 
2. Its object shall be to promote the interests of mathematical statistics. 


ARTICLE II 


MEMBERSHIP 

1. The membership of the Institute shall consist of Members, Junior Members, Fellows, 
Honorary Members, and Sustaining Members. 

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior 
Members excepted, who have been members for twenty-three months prior to the date of 
voting. 

3. No person shall be a Junior Member of the Institute for more than a limited term as 
determined by the Committee on Membership and approved by the Board of Directors. 


ARTICLE III 


OFFICERS, BOARD OF DIRECTORS, AND COMMITTEE ON MEMBERSHIP 


1. The Officers of the Institute shall be a President, two Vice-Presidents, and a Secre- 
tary-Treasurer. The terms of office of the President and Vice-Presidents shall be one year 
and that of the Secretary-Treasurer three years. Elections shall be by majority ballots at 
Annual Meetings of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority vote of the in- 
dividuals present at the organization meeting, and shall serve until December 31, 1936. 

2. The Board of Directors of the Institute shall consist of the Officers, the two previous 
Presidents, and the Editor of the Official Journal of the Institute. 

3. The Institute shall have a Committee on Membership composed of a Chairman and 
three Fellows. At their first meeting subsequent to the Adoption of this Constitution, the 
Board of Directors shall elect three members as Fellows to serve as the Committee on 
Membership, one member of the Committee for a term of one year, another for a term of 
two years, and another for a term of three years. Thereafter the Board of Directors shall 
elect from among the Fellows one member annually at their first meeting after their election 
for a term of three years. The President shall designate one of the Vice-Presidents as 
Chairman of this Committee. 


ARTICLE IV 
MEETINGS 


1. A meeting for the presentation and discussion of papers, for the election of Officers, 
and for the transaction of other business of the Institute shall be held annually at such 
time as the Board of Directors may designate. Additional meetings may be called from 
time to time by the Board of Directors and shall be called at any time by the President 
upon written request from ten Fellows. Notice of the time and place of meeting shall be 
given to the membership by the Secretary-Treasurer at least thirty days prior to the date 
set for the meeting. All meetings except executive sessions shall be open to the public. 
Only papers accepted by a Program Committee appointed by the President may be pre- 
sented to the Institute. 

2. The Board of Directors shall hold a meeting immediately after their election and 
again immediately before the expiration of their term. Other meetings of the Board may 
be held from time to time at the call of the President or any two members of the Board. 
Notice of each meeting of the Board, other than the two regular meetings, together with a 
statement of the business to be brought before the meeting, must be given to the members 
of the Board by the Secretary-Treasurer at least five days prior to the date set therefor. 
Should other business be passed upon, any member of the Board shall have the right to 
reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call 
of the Chairman or any member of the Committee provided notice of such call and the 
purpose of the meeting is given to the members of the Committee by the Secretary- 
Treasurer at least five days before the date set therefor. Should other business be passed 
upon, any member of the Committee shall have the right to reopen the question at the 
next meeting. Committee business may also be transacted by correspondence if that 
seems preferable. 

4. At a regularly convened meeting of the Board of Directors, four members shall constitute 
a quorum. At a regularly convened meeting of the Committee on Membership, 
two members shall constitute a quorum. 


ARTICLE V 


PUBLICATIONS 


1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute. 
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the 
Board of Directors of the Institute. The term of office of the Editor may be terminated at 
the discretion of the Board of Directors. 


2. Other publications may be originated by the Board of Directors as occasion arises. 


ARTICLE VI 


EXPULSION OR SUSPENSION 


1. Except for non-payment of dues, no one shall be expelled or suspended except by 
action of the Board of Directors with not more than one negative vote. 


ARTICLE VII 
AMENDMENTS 


1. This constitution may be amended by an affirmative two-thirds vote at any regularly 
convened meeting of the Institute provided notice of such proposed amendment shall have 
been sent to each voting member by the Secretary-Treasurer at least thirty days before the 
date of the meeting at which the proposal is to be acted upon. Voting may be in person or 
by mail. 





BY-LAWS 


ARTICLE I 


DUTIES OF THE OFFICERS, THE EDITOR, BOARD OF DIRECTORS, AND COMMITTEE ON MEMBERSHIP 


1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the 
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present, shall 
preside at the meetings of the Institute and of the Board of Directors. At meetings of the 
Institute, the presiding officer shall vote only in the case of a tie, but at meetings of the 
Board of Directors he may vote in all cases. At least three months before the date of the 
annual meeting, the President shall appoint a Nominating Committee of three members. 
It shall be the duty of the Nominating Committee to make nominations for Officers to be 
elected at the annual meeting and the Secretary-Treasurer shall notify all voting members 
at least thirty days before the annual meeting. Additional nominations may be sub- 
mitted in writing, if signed by at least ten Fellows of the Institute, up to the time of the 
meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings at 
the meetings of the Institute and of the Board of Directors, send out calls for said meetings 
and, with the approval of the President and the Board, carry on the correspondence of the 
Institute. Subject to the direction of the Board, he shall have charge of the archives and 
other tangible and intangible property of the Institute, and once a year he shall publish in 
the Annals of Mathematical Statistics a classified list of all Members and Fellows of the 
Institute. He shall send out calls for annual dues and acknowledge receipt of same; pay 
all bills approved by the President for expenditures authorized by the Board or the Insti- 
tute; keep a detailed account of all receipts and expenditures, prepare a financial statement 
at the end of each year and present an abstract of the same at the annual meeting of the 
Institute after it has been audited by a Member or Fellow of the Institute appointed by the 
President as Auditor. The Auditor shall report to the President. 

3. Subject to the direction of the Board, the Editor shall be charged with the responsi- 
bility for all editorial matters concerning the editing of the Annals of Mathematical Sta- 
tistics. He shall, with the advice and consent of the Board, appoint an Editorial Commit- 
tee of not less than twelve members to co-operate with him; four for a period of five years, 
four for a period of three years, and the remaining members for a period of two years, ap- 
pointments to be made annually as needed. All appointments to the Editorial Com- 
mittee shall terminate with the appointment of a new Editor. The Editor shall serve as 
editorial adviser in the publication of all scientific monographs and pamphlets authorized 
by the Board. 

4. The Board of Directors shall have charge of the funds and of the affairs of the In- 
stitute, with the exception of those affairs specifically assigned to the President or to the 
Committee on Membership. The Board shall have authority to fill all vacancies ad in- 
terim, occurring among the Officers, Board of Directors, or in any of the Committees. The 
Board may appoint such other committees as may be required from time to time to carry 
on the affairs of the Institute. The power of election to the different grades of Member- 
ship, except the grades of Member and Junior Member, shall reside in the Board. 

5. The Committee on Membership shall prepare and make available through the 
Secretary-Treasurer an announcement indicating the qualifications requisite for the different 
grades of membership. The Committee shall review these qualifications periodically 
and shall make such changes in these qualifications and make such recommendations with 
reference to the number of grades of membership as it deems advisable. The power to 
elect worthy applicants to the grades of Member and Junior Member shall reside in the 
Committee, which may delegate this power to the Secretary-Treasurer, subject to such 
reservations as the Committee considers appropriate. The Committee shall make recom- 
mendations to the Board of Directors with reference to placing members in other grades 
of membership. The Committee shall give its attention to the question of increasing the 
number of applicants for membership and shall advise the Secretary-Treasurer on plans 
for that purpose. 


ARTICLE II 
DUES 


1. Members shall pay five dollars at the time of admission to membership and shall receive 
the full current volume of the Official Journal. Thereafter, Members shall pay five dollars 
annual dues. The annual dues of Junior Members shall be two dollars and fifty cents. 
The annual dues of Fellows shall be five dollars. The annual dues of Sustaining Members 
shall be fifty dollars. Honorary Members shall be exempt from all dues. 

(a) Exception. In the case that two Members of the Institute are husband and wife 
and they elect to receive between them only one copy of the Official Journal, the annual 
dues of each shall be three dollars and seventy-five cents. 

(b) Exception. Any Member or Fellow may make a single payment which will be 
accepted by the Institute in place of all succeeding yearly dues and which will not otherwise 
alter his status as a Member or Fellow. The amount of this payment will depend upon 
the age of this Member or Fellow and will be based upon a suitable table and rate of inter- 
est, to be specified by the Board of Directors. 

(c) Exception. Any Member or Junior Member of the Institute serving, except as a 
commissioned officer, in the Armed Forces of the United States or of one of its allies, may 
upon notification to the Secretary-Treasurer be excused from the payment of dues until the 
January first following his discharge from the Service. He shall have all privileges of 
membership except that he shall not receive the Official Journal. However during the first 
year of his resumed regular membership he may have the right to purchase, at $2.50 per 
volume, one copy of each volume of the Official Journal published during the period of his 
service membership. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to the 
Official Journal. The annual dues of a Sustaining Member include two subscriptions to 
the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues 
may be six months in arrears, and to accompany such notice by a copy of this Article. If 
such person fail to pay such dues within three months from the date of mailing such notice, 
the Secretary-Treasurer shall report the delinquent one to the Board of Directors, by whom 
the person’s name may be stricken from the rolls and all privileges of membership with- 
drawn. Such person may, however, be re-instated by the Board of Directors upon pay- 
ment of the arrears of dues. 







ARTICLE III 


SALARIES 


1. The Institute shall not pay a salary to any Officer, Director, or member of any com- 
mittee. 


ARTICLE IV 


AMENDMENTS 


1. These By-Laws may be amended in the same manner as the Constitution or by a 
majority vote at any regularly convened meeting of the Institute, if the proposed amend- 
ment has been previously approved by the Board of Directors. 





MATHEMATICAL REVIEWS 


offers 


prompt review of the mathematical literature of the world 
complete subject and author indices in last issue of volume 
microfilm or photoprints of most articles reviewed (at cost) 


at the very low rate of $13 per year. 
Publication of this journal is sponsored by the American Mathematical Society, 
Mathematical Association of America, London Mathematical Society, Edinburgh 
Mathematical Society, Academia Nacional de Ciencias de Lima, Union 
Matematica Argentina and others. 


Send subscription order or request for sample copy to 


AMERICAN MATHEMATICAL SOCIETY 
531 West 116th Street, New York City 


JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


DECEMBER, 1944 · $1.50 PER COPY · $6.00 PER ANNUM · VOL. 39 · NO. 228 





Wartime Developments in Agricultural Statistics 
    H. R. TOLLEY AND CONRAD TAEUBER 411 
Insurance in the International Balances of Payments ... S. J. LENGYEL 428 
Measurement of Industrial Production since 1939 ... FRANK R. GARFIELD 439 
Economic Consumption Scales and Their Uses ... ROBERT MORSE WOODBURY 455 
The Distribution of Private, Non-Agricultural Employees in the United 
    States by Straight-Time Hourly Wage Rates ... DAVID R. ROBERTS 469 
A Simplified Calculation of the Potency of Penicillin and Other Drugs Assayed 
    Biologically with a Graded Response ... C. I. BLISS 479 
A Method of Analysis of Family Composition and Income 
    T. J. WOOFTER, JR. 488 
Determining Sample Size ... J. A. NORDIN 497 
An Interpretation of the Quantity Index ... WARREN C. WAITE 507 
Some Methods for the Evaluation of a Sum ... LEO A. AROIAN 511 
Industrial Classes in the United States in 1940 ... TILLMAN M. SOGGE 516 
[title illegible] ... 519 


Book Reviews 


AMERICAN STATISTICAL ASSOCIATION 
1603 K Street, N.W., Washington 6, D. C. 





BIOMETRIKA 


A Journal for the Statistical Study of Biological Problems 
Vol. XXXIII, Part II. August, 1944 


CONTENTS 

On autoregressive time series ... M. G. KENDALL 
Comparison of the concepts of efficiency and closeness for consistent estimates of a 
    parameter ... R. C. GEARY 
The relation between measures of correlation in the universe of sample permutations 
    ... H. E. DANIELS 
The growth, survival, wandering and variation of the long-tailed field mouse, 
    Apodemus sylvaticus ... H. P. HACKER and H. S. PEARSON 
The control of industrial processes subject to trends in quality ... L. H. C. TIPPETT 
Studentization, or the elimination of the standard deviation of the parent population 
    from the random sample-distribution of statistics ... H. O. HARTLEY 


Miscellanea: 

Note on the use of the tables of percentage points of the incomplete beta function 
    to calculate small sample confidence intervals for a binomial p 
    ... HENRY SCHEFFÉ 


The subscription price, payable in advance, is 45s. inland, 54s. export (per volume including postage). Cheques 
should be drawn to Biometrika and sent to "The Secretary, Biometrika Office, Department of Statistics, 
University College, London, W.C. 1." All foreign cheques must be in sterling and drawn on a bank 
having a London agency. 


ECONOMETRICA 


Journal of the Econometric Society 
VOL. 13, NO. 1 JANUARY, 1945 
Contents 


Cleveland Meeting Round Table, September 14, 1944 
ARTHUR SMITHIES: Forecasting Postwar Demand: I 
S. MORRIS LIVINGSTON: Forecasting Postwar Demand: II 
JACOB L. MOSAK: Forecasting Postwar Demand: III 
Forecasting Postwar Demand: Discussion 


ZENON SZATROWSKI: Time Series Correlated with the Beef-Pork Consumption 


Announcements, Notes, and Memoranda 

The Econometric Society is an international society for the advancement of economic theory in its 
relation to statistics and mathematics. 

Subscriptions to Econometrica and inquiries about the work of the Society and the procedure in 
applying for membership should be addressed to Alfred Cowles, Secretary and Treasurer, The Econometric 
Society, The University of Chicago, Chicago 37, Illinois. 








